The first consumer Internet appliances were disappointments, but does that 
mean future generations will suffer a similar fate? 
Old Cat, New Tricks 



I've promised to be completely honest with you in this 
column, for better or for worse, so let me come clean about 
this: I liked the CueCat. Despite its many flaws and all of 
the bad press, the ugly little barcode reader captured my 
attention in a way that gizmos these days rarely do. Of 
course DigitalConvergence, the company behind the 
CueCat, nearly folded earlier this year after spending $250 
million to distribute 10 million of the cat-shaped devices for 
free. But the barcode scanners themselves are still around—I 
have one in my desk drawer—and I’m still caught up in all 
that they stood for. 

When Forbes magazine first sent out over 800,000 
CueCats with the expectation that subscribers would plug 
them into their computers and use them to scan special 
barcodes on magazine ads, the critics couldn’t stop shouting. 
It’s a case of a solution without a problem, they said. 
Then, the complaints began anew when smart consumers 
discovered that, with each scan, the CueCat software 
transmitted a unique identifier to DigitalConvergence’s servers, 
which enabled the company to track the habits of individual 
users. This raised the ire of privacy advocates, and developers 
continued to modify and disable the tracking software 
with renewed cause. Finally, the grumbling reached a 
crescendo when hackers revealed a batch of users’ email 
addresses insecurely stored on the company’s servers. 

I won’t defend DigitalConvergence or its practices. The 
company made poor decisions: it tried to introduce a 
proprietary barcode format, the device barely worked, the 
license initially ignored privacy concerns, the software was 
full of unwanted advertising, and the actual product 
name, spelled :CueCat, was cringeworthy. Yet, the concept 
was true genius. 

CueCat was designed to let users scan any item with a 
barcode to immediately download more information about 
that item into their Web browser. A concept like this would 
never have been possible without the Internet. Because of 
the sheer number of items with barcodes and the fact that 
product information often changes without notice, it’s 
unrealistic to keep and update copies of a database on 
every consumer’s computer. Instead, the database can 
reside on a server, where it may be changed and added to 
with the guarantee that each client accessing it receives 
the most up-to-date information. Had the CueCat scanner 
worked better, and had each scan resulted in the retrieval 
of unique, relevant information rather than a standard Web 
page, computer distributors soon might have begun shipping 
PCs with three input devices: a keyboard, a mouse, 
and a barcode reader. 

The Internet gives rise to a whole new group of devices 
that not only let users input data, but also let them access 
information and carry out specific computing tasks. The 
latter group, called Internet appliances, is well poised to 
outnumber desktop computers in the near future. That’s 
not hard to imagine if you consider that the next generation 
of radios, televisions, copiers, and other household 
and business appliances will be Internet-enabled. Whether 
used for communication, computation, or entertainment, 
the appliances will contain enough processing power and 
memory to host a TCP/IP stack and even an embedded 
database. With the proliferation of such devices, it’s unfortunate 
that CueCat’s parents referred to themselves as 
DigitalConvergence: the industry is in fact seeing increasing 
divergence due to the ability to network products. 

And what about the claim that these Net-enabled 
devices are just technical solutions without real-world 
problems? Would CueCat owners really have grown as 
comfortable using it as they would a mouse? Did we learn 
nothing from the failure of 3Com’s Audrey, a simplified 
browsing device, and from Netpliance’s i-opener Internet 
terminal? It’s important to remember that not all solutions 
stem from a direct problem. There’s a general need for 
information, and as a result, perhaps Internet appliances 
are better termed “information appliances.” 

In the midst of the current downturn sensationalism, 
it’s easy to overlook the fact that a company’s failure isn’t 
necessarily the result of a product concept failure. Several 
factors contribute to a company’s success, including 
management, sales, and marketing. As is evident, 
DigitalConvergence failed in many of these areas during 
the execution stages of a great concept. At the very least, the 
doors have been opened. Now it’s up to smart developers 
to learn from others’ mistakes and step through with 
devices that work. Until then, I have my CueCat right here 
as a reminder of a quality concept, and as inspiration for 
the next iteration of the Internet. 
Analog is the Answer 

About your “Roundup of Web Traffic Analysis 
Tools” (July 2001). Your article did nothing to 
answer my sole question about logfile 
analyzers: “Why would I want to use this 
product instead of Analog?” Analog (with 
DNSTran and Report Magic) handles pretty 
much everything you listed for nonenterprise-level 
analysis. 

Francis Uy 

Web Coordinator 

fuyi@jumbc.edu 

www.jhu.edu/gifted/cde/ 410-516-0162 

Analog is a core part of any Web site admin’s 
toolset, and the magazine has covered it extensively 
in past articles. For an example, please 
take a look at “Knee-Deep in Log Files,” by Phil 
Clatz (July 1999). The product review was 
designed specifically to introduce solutions 
available in the enterprise-level space, which 
has a different set of requirements, such as 
reporting and high quality graphical output. 
Mana Tominaga 
Associate Products Editor 


Security Through Obscurity 

I just read Al Williams’s article on steganography. 
I cannot count how many times, in the InfoSec 
field, I’ve had hard-core techies turn up their 
collective noses at any implementation of 
“security through obscurity.” Isn’t it humorous, 
however, how quickly those same people will 
embrace steganography? 

Rich Bryant 

Author of Unix Security for the Organization 
rbryant@jiseek.com 
In Your Prime 

I enjoyed Amit Asaravala’s comments about 
encryption in “Home Page” (Aug. 2001). However, 
I believe he was using the word “prime” 
incorrectly in a couple of places. For example, 
he states, “If someone asked you to factor the 
primes of 437...” I believe he should have said: 
“If someone asked you to find the prime factors 
of 437...” You see, primes can’t be factored 
(by definition); factoring a number all the way 
obtains primes. A little later, he says, “mathematicians 
can’t prove that a shortcut equation 
for factoring primes...” It’s impossible to factor 
primes; he probably meant “mathematicians 
can’t prove that a shortcut equation for 
producing the prime factors of...” 

Joe Biegen 
JBiegen@jstny.rr.com 

Mea culpa! My engineering side must have 
given way to my editing side for a moment. 
Thanks for the heads up. 

Amit Asaravala 
Editor in Chief 
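
For readers who want to check the arithmetic behind this exchange, here is a minimal sketch in Python of finding prime factors by trial division. The function name prime_factors is our own choice for illustration, and trial division is only the simplest approach; it is practical for small numbers like 437, not for the very large numbers used in encryption.

```python
def prime_factors(n):
    """Return the prime factors of n, smallest first, by trial division."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    divisor = 2
    # Any composite n has a factor no larger than its square root.
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    # Whatever remains above 1 is itself prime.
    if n > 1:
        factors.append(n)
    return factors


print(prime_factors(437))  # [19, 23]
print(prime_factors(19))   # [19]
```

The second call illustrates the letter's point: a prime has no factors to find other than itself.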

Hated It! 

I thought Molly Holzschlag’s “Freedom in Structure” 
was in terrible need of cutting. It took 
multiple sentences to get simple points across. 
And those were not much worth reading. After 
the obvious (the Web is not like print or TV) and 
the platitudes (“Think with both sides of your 
brain”), there isn’t much there. 

If different cultures vary in their degrees of 
linear thinking, and it has something to do with 
the Web, I didn’t get it from the article. Most 
people accessing the Web (at least in the work 
I’m doing) are firmly entrenched in the Western 
tradition. If there’s something to learn from 
non-linear world views, what is it? The fact is, 
we all (even the most Western of us) think 
non-linearly. So, what was the point? 

Nelson Stubbins 
Engineering Training Manager 
Wind River Systems 
nelson @jwind river.com 

Loved It! 

I say hurrah to Molly Holzschlag’s article “Clean 
Up, Flatten Out” (August 2001). I am so bored 
of the glitz that’s used everywhere, that it 
turns me off. I really appreciate a simple Web 
site that’s unpretentious, yet presents its 
material in a clean manner. 

I agree that the Web isn’t just another TV 
and we shouldn’t treat it as such. Let’s learn to 
make effective use of the new communications 
media and not abuse it. Thanks for the fresh view! 

Dave Pfaltzgraff 
davepfzgjfred.net 

Designing Government 
Intranets 

"A Solid Intranet in Eight Steps” (July 2001) was 
very useful and informative. I’ve been newly 
assigned to work on changing our county's 
intranet and I’m wondering if you could suggest 
some good books or magazines on the subject 
of government intranets? 

Eve Tobias 

Executive Assistant to the CIO 
etobias@jcharlestoncounty.org 

The most critical obstacle facing government 
internets and intranets these days is accessibility. 
I refer to a few articles on my home 
page: www.webreference.com/new/010621.html#feature. 

Also, here are some articles on measuring 
the return on investment in intranets: 
www.intranetjournal.com/articles/200104/ii_04_25_01a.html 
and www.intranetjournal.com/articles/200106/pin_06_20_01a.html. 

Theo Mandel 

Fire Your Ad Agency 

You’re right on the money in “Fire Your Ad 
Agency” in the June issue. Ads get more intrusive 
as time goes on. Now you have almost 
screen-size ads popping up. There are times 
when I do click on an ad, but that’s primarily 
because something about the ad caught my 
attention. I haven’t done that in a while because 
I instinctively close ads before really looking at 
them, assuming that they’re just annoying. 

Companies need to spend more time developing 
better ads instead of on positioning and 
the like. But why deal with the real problem 
when dealing with superfluous things will give 
the illusion that you’re fixing the problem? 

Bill Cunningham 
reggie@jpursuingthetruth.org 

Cover Art 

It’s obvious that you go out of your way to get 
great cover artwork. I've tried to find acknowledgment 
of the artists you use, but couldn’t 
find it. Do you show the credits somewhere in 
your publication? If so, where? Great publication. 
Excellent format and interesting articles. 
Dave Russell 

David.S.Russell@jstate.or.us 

You’ll find artist credits in the lower left 
corner of our table of contents. This month’s 
illustrator is Tiffany Larsen; she’s also done 
work for ChickClick.com and FOX Broadcasting. 
We hope you like her artwork as much as 
we do. 

Richard Koscher 
Art Director 
WEBMASTER’S DOMAIN 


How is the Microsoft monopoly hurting you? New licensing policies could hit 
harder than you expect. 



Monopoly Power 


Lincoln Stein 


The U.S. Federal Court of Appeals has issued its decision 
on the United States vs. Microsoft antitrust trial: Microsoft is 
a monopoly, and has abused its power. However, the 
appeals court has stated that the punishment imposed by 
U.S. District Court Judge Penfield Jackson doesn’t fit the 
crime. Now a lower court or a settlement agreement will 
decide what the proper punishment should be. 

In the meantime, Microsoft has removed the restrictive 
language from its contracts with PC makers that forbade 
placing competing software icons on the Windows desktop. 
At the same time, the company is readying the October 
release of Windows XP. This version of the operating system 
will substantially increase the amount of integration between 
traditional operating system functions such as file 
management, and traditional application functions like 
multimedia and Web browsing. 

I’ve been watching these events unfold with bemused 
detachment. True, I do get annoyed when Microsoft products 
don’t work as expected. I’m always a bit reluctant to click on 
the font size menu in my version of PowerPoint because one 
time out of a hundred this action causes the entire system to 
freeze. However, I’ve been pretty satisfied with Microsoft 
products. The operating system comes for free (or seems to) 
whenever I purchase a new PC. And as a member of an 
academic institution I’m entitled to a substantial 80 percent 
discount on Microsoft Office and other products. 

Over the past two weeks, though, I’ve learned two lessons 
about the meaning of monopoly power. Suddenly, the 
antitrust suit got a lot more personal. 

No Discounts for Cancer Researchers? 

I had my first lesson a couple of weeks ago. My laboratory's 
director of information services (IS) came back shaken from 
a meeting of IT department heads from U.S. cancer research 
centers. This year Microsoft quietly changed its definition 
of a qualifying academic institution. The old definition 
encompassed any primary or secondary school, degree-granting 
institution, or research laboratory affiliated with a 
degree-granting institution. 

The new definition imposes strict guidelines on the ratio 
of staff to students, and on the proportion of staff directly 
involved with teaching. As a result, Microsoft sales representatives 
have recently informed several biomedical 
research laboratories that they will lose their academic 
discount on Microsoft products effective this fall. The list 
includes several cancer centers, including some of the most 
respected research institutions in the nation. My research 
institution (Cold Spring Harbor Laboratory) has recently 
become an accredited university, but it may not meet the 
strict staffing requirements, putting its academic discount 
in jeopardy as well. This loss has major financial implications, 
as it means that we'll have to pay five times more for 
all new software and upgrades. 

Later, at a meeting of the lab IS oversight committee, we 
discussed what we could do if Microsoft decides to reclassify 
us. The sad conclusion: not much. 

Ten years ago, there were many choices of word processors: 
XYWrite, WordStar, and WordPerfect, to name a few. Then 
Microsoft Word appeared on the market, and Microsoft played 
its hand very shrewdly. First, it set the retail price for Word at 
$99, much lower than the industry leader, WordPerfect, at 
$400. Second, Word had an extremely good WordPerfect 
import function, but anemic WordPerfect export functionality. 
This made it easy for users to make the transition from 
WordPerfect to Word, but hard to move the other way. 

Once Word had carved out a segment for itself, Microsoft 
cemented its position with a series of version upgrades. In 
addition to adding ever more features, each upgrade 
produced files that earlier versions couldn’t read. As soon 
as a significant percentage of people upgraded to the latest 
and greatest version, the rest of the world was forced to 
go along so they could read the documents that early 
adopters were producing. 

Now Microsoft Word costs $340 retail and the competition 
is a handful of niche products with small market penetration: 
Sun’s StarOffice, VistaSource’s Applixware, and 
KOffice, the productivity software that comes bundled with 
the open-source KDE desktop system. These products are all 
capable of handling a typical user’s word processing, 
spreadsheet, and presentation needs, but none of them are 
perfect when it comes to exchanging files with their 
Microsoft Office counterparts. 

Upgrading 

My second lesson was six years in the making. In 1995, I 
bought my first PC, a Dell Dimension P90 with a Pentium 
90 chip and 32MB of memory. It arrived with Windows 95 
preinstalled along with bundled copies of Microsoft Word, 
Excel, and Access. 

I’d bought the system to experiment with Linux, so my 
first action after plugging in the system was to follow Dell’s 
directions to make a set of system recovery floppies for 
Windows 95 and Office. I threw the write-protect tabs on all 
60-odd floppies, and placed them in a shoebox in the bookcase. 
I then proceeded to wipe the hard disk clean and 
install the Slackware version of Linux, which at the time 
took up all of 11 floppies. 

This Dell served me very well for five years, first as a Linux 
learning machine, then as a Web development desktop, and 
finally as a Web server for my personal home pages and the 
www.modperl.com site. Over the years I gave it 
several brain transplants, upgrading its memory, 
processor, disk, and graphics board. I added 
a tape drive and a sound card. As Internet connectivity 
options broadened, I added a series of 
networking options: an analog modem card, an 
ISDN card, an ethernet card, and finally a 
PCMCIA bridge so that I could plug in a wireless 
card. As I upgraded the hardware, I upgraded 
Linux. At first I would purchase Linux CDs ($20 
at the bookstore, $5 to $10 at computer fairs), 
but as my Internet connectivity increased, I 
found it easier to download new kernels, applications, 
and drivers as I needed them. 

Finally, last winter the machine had outlived 
its usefulness as a Web server, so I replaced it 
with an Athlon-based machine built from spare 
parts that had been donated to the modperl.com 
site. The Dell went into the basement alongside 
the Macintosh dating from the same era. 

But the Dell’s retirement was short-lived. A 
few months later, my father-in-law came to 
stay with my wife and me, and we thought it 
would be nice for him to have a computer of 
his own to browse the Internet. So I resurrected 
the Dell. My father-in-law knows Windows, so I 
had the bright idea of restoring the machine to 
its native Windows 95 state. I dusted off the 
Microsoft shoebox, stacked the disks into a 
tottering tower on my desktop, and started to 
reinstall the system. 

I didn’t get very far. Shortly into the 
install process, Windows refused to proceed 
further, referring me to the end user license 
that only permitted me to reinstall the software 
into the manufacturer’s original equipment. My 
heart sank as I thought back to the processor, 
disk, and memory upgrades. This was not the 
original machine that I had bought from Dell! 

The rest of the story isn’t worth telling in 
detail. Suffice to say that it ended up being 
cheaper to buy a good used computer with 
Windows 98 preinstalled on it than to purchase 
a fresh copy of Windows 98 or Windows ME 
from a retailer. This anecdote will become a 
very familiar one among computer users over 
the next few years as Microsoft's new XP product 
licensing policies go into effect. 

For several years now, many of the system 
recovery disks distributed with new PCs have 
been configured to allow system installation on 
only the manufacturer’s original equipment. Yet, 
once the operating system and applications were 
installed, it was easy to upgrade disks, memory, 
video boards, and other hardware. Neither 
Windows nor Microsoft's various applications 
have checked to determine whether they’re 
running on the original hardware. I’ve taken 
advantage of this on several occasions to move 
Windows from an old laptop to its replacement 
simply by doing a disk-to-disk copy under Linux. 

But this is no longer the case. This spring, 
Microsoft introduced a new authentication 
system with its Microsoft Office XP release. 
Under the new system, Office XP must be activated 
using a valid activation key before it can 
become fully functional. During activation, 
Office stores information about your current 
hardware configuration to disk. 

It also uploads, via the Internet or phone line, 
the activation key and hardware configuration 
data to Microsoft’s product registration database. 
This lets Microsoft catch people who try 
to install the software on more machines than 
their license allows. The base license allows 
three activations of Office: one for a desktop 
machine, one for a laptop, and one to use in 
case of a system crash. By some reports, the 
OEM-bundled versions of Office allow only one 
activation. 

The kicker here is that Office XP probes the 
current hardware configuration each time you 
run it and will deactivate itself if it detects what 
Microsoft calls a “significant” change in the 
hardware. You have two options at this point. 
You can either reinstall the 
software, using up one of 
the three installs you’re 
allowed, or you can contact 
a Microsoft support center 
and request permission to 
reactivate the software. 
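
The column does not say how Office XP decides that a hardware change is “significant,” and Microsoft’s actual algorithm is not reproduced here. As a purely hypothetical sketch of the general idea (record the hardware at activation, compare it at each launch, and give up past some threshold), something like the following Python would do. The component names, the threshold of two changed components, and every function name are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical threshold: how many components may differ before the change
# counts as "significant". The real rule is not given in the article.
MAX_CHANGED_COMPONENTS = 2


@dataclass
class Activation:
    """State saved to disk when the software is activated (illustrative)."""
    snapshot: dict             # component name -> identifier at activation time
    activations_left: int = 2  # base license: three activations, one already used


def changed_components(snapshot, current):
    """Return the sorted names of components that differ from the snapshot."""
    names = sorted(set(snapshot) | set(current))
    return [name for name in names if snapshot.get(name) != current.get(name)]


def is_still_activated(activation, current):
    """Probe run at each launch: tolerate small changes, reject large ones."""
    changed = changed_components(activation.snapshot, current)
    return len(changed) <= MAX_CHANGED_COMPONENTS


def reactivate(activation, current):
    """Spend one remaining activation and record the new hardware, if any are left."""
    if activation.activations_left == 0:
        return False  # the user would have to contact a support center instead
    activation.activations_left -= 1
    activation.snapshot = dict(current)
    return True


original = {"cpu": "Pentium 90", "memory": "32MB", "disk": "disk-A", "video": "card-A"}
upgraded = {"cpu": "CPU-B", "memory": "128MB", "disk": "disk-B", "video": "card-A"}

office = Activation(snapshot=dict(original))
print(is_still_activated(office, original))  # True
print(is_still_activated(office, upgraded))  # False
print(reactivate(office, upgraded))          # True
print(is_still_activated(office, upgraded))  # True
```

In this toy version, replacing the processor, memory, and disk together trips the check, while a single memory upgrade would not. Where the real product draws that line is exactly what the end user cannot see.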

This new authentication 
system is now being 
applied to other Microsoft 
products, including the 
upcoming Windows XP. It’s 
part of a large-scale change 
in Microsoft’s licensing 
policies, which are moving 
away from the traditional 
software retail business 
model toward a model in 
which end users lease 
software via subscription. 

This is part of Microsoft’s 
long-range .Net strategy, 
which will replace integrated 
software packages 
like Office with a set of 
pay-per-use modules downloaded 
from the Internet. 

The Bottom Line 

No doubt Microsoft sees 
this new way of authorizing 
and leasing software as good for the bottom 
line. End users, who must now ask Microsoft for 
permission to upgrade their hardware, may find 
it outrageous. But what can they do about it? 

Monopoly power means that Microsoft can 
change licensing agreements, upgrade mechanisms, 
and support policies. It can shake down 
cancer researchers, gather information about a 
user's hardware configuration, and limit how 
much a user can upgrade his or her computer. 

Microsoft can do this because its customers 
have no alternative when it comes to desktop 
operating systems and productivity applications. 
No matter how high the price goes, or 
how unbalanced the licensing terms become, it 
won't drive its customers into the arms of the 
competition, because there is none. 

By the way, as this article was going to press, 
Microsoft announced that it had withdrawn 
support for Java in Windows XP. Anyone who 
says that Microsoft’s monopoly power doesn’t 
hurt the consumer is kidding themselves. 

Lincoln is an M.D. and Ph.D. who designs information 
systems for the human genome project at 
Cold Spring Harbor Laboratory in New York. You 
can email him at lstein@cshl.org. 
LEGAL CODE 


Bret A. 
Fausett 



Will the recent Supreme Court decision leave a hole in electronic 
history, or just force publishers to do the right thing? 


Bringing Old Content to a New 
Medium 


David S. Whitford is a freelance author. Between 1990 
and 1992, when the Internet was primarily an academic 
tool, Mr. Whitford contributed several articles to The New 
York Times, Newsday, and Sports Illustrated. Each of his 
articles appeared, as he certainly had hoped and expected, 
in the print editions of those publications. And if that 
were the only place his articles had ever appeared, I 
wouldn't be writing this column. 

However, Mr. Whitford's articles also appeared in a proprietary 
online service called Lexis-Nexis, as well as the 
University Microfilm, Inc. (UMI) product “The New York Times 
OnDisc,” a CD-ROM compilation of back issues of The New 
York Times. With each of the Whitford articles, the original 
publisher had licensed its right in the articles to an online or 
electronic distributor and recouped a licensing fee in return. 

While these new media licensing deals provided yet 
another outlet for his work, and certainly let the articles 
reach a wider audience over a longer period of time, Mr. 
Whitford wasn’t amused. He had contributed his articles for 
print publication only. These secondary publications, in both 
online and CD-ROM formats, were, he argued, a violation 
of his copyright. 

So in 1993, with five like-minded authors who had contributed 
to the same publications and whose work had 
suffered a similar fate, Mr. Whitford sued The New York 
Times Company, Newsday Inc., and Time Inc. (publisher of 
Sports Illustrated) for copyright infringement. The authors 
sought an injunction prohibiting further publication of 
their work as well as monetary damages. 

The verdict in the case, Tasini v. The New York Times 
Company, would provide critical guidance on a key legal 
question faced by many online publishers: what did they 
already have the rights to publish online, and what rights 
would they have to renegotiate or acquire? 

The stakes were high for The New York Times Company 
and its fellow print publishers. 

Bringing a Business to the Internet 

Few businesses have larger catalogues of content to bring 
to the Internet than traditional newspaper and magazine 
publishers. These companies generate more content in a 
single issue than Matt Drudge writes in a year. Much of that 
content is penned by staff writers, and thus owned by the 
publishers to use in any way they want. However, a significant 
portion of the material published every day is the work 
of contracted freelance authors. 

Freelance author contracts likely differ from author to 
author. For example, you’d have to review historic contracts 
(negotiated before the Internet existed or before Internet 
use was reasonably widespread) on an individual basis to 
determine whether your company had purchased online or 
electronic publication rights. 

The cost of losing the lawsuit for The New York Times 
Company was not only the cost of the new licensing fee 
that would have to be paid to the freelance authors, but 
also the costs of reviewing the historic contracts and negotiating 
new ones. Most importantly perhaps, this would 
increase print publications’ time to market because contacting 
freelance authors and obtaining the necessary 
rights is a time-consuming and expensive process. 

A brief filed by the three publishers warned that a victory 
for the authors would "punch gaping holes in the electronic 
record of history," as the work of those who had not granted 
online rights to the publishers would have to be removed 
from electronic databases—including the Web. 

The wheels of justice move slowly, and in June 2001, eight 
years after Mr. Whitford filed suit, the U.S. Supreme Court 
held that he and his fellow freelancers were correct: online 
republication of their articles did indeed constitute copyright 
infringement. 

Recycled Content 

By law, authors own a copyright in their articles, but 
publishers own a copyright in the collective work in which 
those articles appear. The newspaper you read this morning 
and the magazine you're reading right now are collective 
works in which the publisher owns the original copyright. 
Individual articles, however, and even the words in this 
column, are the original copyright of the author. Publishers 
typically obtain the rights to publish copyrighted articles in 
their collective works by paying for them. 

The publisher's copyright in the collective work usually 
includes the right to revise it. A newspaper doesn’t violate 
any author’s copyright when it creates a second edition of 
the paper or republishes the paper in Braille or on microfilm. 
And this right to create revisions was the very ground 
on which The New York Times Company, Newsday, and 
Time based their right to redistribute Mr. Whitford’s articles 
in electronic form. The online versions of the newspapers 
and magazines, they argued, were simply revisions of the 
original works. 

The freelance authors, however, noted that when their 
articles were included within an online database, each article 
was presented within a separate ASCII file, which was 
allocated only to that article. While the original articles 
probably appeared near advertising and next to other articles 
in the same edition, when presented online, they were 
separated from their original context and presented alone. 
An article’s value as an independent piece is actually 
enhanced when it’s presented in this way, especially if it’s 
found in response to a user's search. This isn’t 
a revision to a collective work, the freelancers 
argued; it’s a reprint of an individual copyrighted 
article. 

And by a 7-2 vote, the Supreme Court agreed 
with the freelancers. The decision held that the 
act of placing the original document into a new 
ASCII file designed for inclusion in an online 
database or CD-ROM product constituted copyright 
infringement under the U.S. Copyright Act. 

While this case has particular relevance to 
newspaper and magazine publishers, it also 
underscores an important question that any 
brick-and-mortar business looking to market 
its existing properties on the Internet should 
ask itself: Do you own the Internet rights to 
your goods, services, or licensed intellectual 
property? If you don’t, it’s time to prepare for 
the transition. 

Internet Rights 

If your company has a significant back-catalogue 
of publications, this Supreme Court 
ruling will greatly impact your ability to distribute 
historic content on the Internet. You 
must purchase, from the original author, Internet 
rights to any content licensed from a third party, 
such as a freelance journalist. 

There are no magic words that outline 
Internet rights, and each company words its 
contracts differently when licensing intellectual 
property. However, as the publisher, your 
contracts should give you the right to present, 
deliver, or distribute the property in question 
over any electronic medium, including the 
Internet, cable or wireless systems, proprietary 
dial-up networks, or any other subsequently 
developed medium for delivering content digitally. 
When creating new contracts, keep the 
contractual language broad. If your company 
failed to purchase these rights the first time 
out, you don’t want to be surprised again when 
a new medium is developed. 

For several years, publishers have been aware 
of the need to secure Internet rights to works 
they license first for use in other media. They 
have specifically asked for those rights in their 
contracts with authors or other creators of 
intellectual property. The Supreme Court even 
noted this fact in its opinion in the Tasini 
v. The New York Times Company case. Since 
1995, The New York Times Company has contractually 
required all of its authors to transfer 
electronic publication rights to their works. 

But prior to the time when it began securing 
these rights as a matter of course, few of its 
contracts may have contained any mention of 
Internet rights. Hence, decades of The New York 
Times content can no longer be distributed 
online unless the newspaper painstakingly renegotiates 
with all of its original authors. 

Music, Video, and Film 

While aspects of this ruling make it especially 
pertinent to publishers, all intellectual property 
distributors are affected. Few companies were 
prepared for the rapid emergence of the Internet, 
which explains why few licensing contracts 
issued before the last couple of years have 
included the rights to publish intellectual property 
on the Internet. 

This enormous problem has gone largely 
unreported in the stories about the Internet 
revolution. And when it has been reported, it has 
given most media organizations a bad rap, at 
least in terms of how they’re perceived on the 
Internet. While Napster was signing up millions 
of users each month during its heyday, the 
media often portrayed member companies of 
the Recording Industry Association of America 
as bumbling luddites, unable to conceive—much 
less execute—a competent Internet strategy. 
Netizens often dismissed these companies’ propensity 
to file suit against Internet distribution 
channels as a knee-jerk substitute for an independent 
vision of how to serve the Internet 
community. 

Other companies suffered under the same 
misconceptions. If you believed the Internet 
gospel, the movie industry was paralyzed by 
the specters of Scour and Gnutella; television 
networks were stunned by the likes of iTV; the 
publishing industry wanted to lock down words 
onto secure packages; and the comic book, 
game, and toy industries feared a future in 
which they would be killed by Flash animation 
and Quake-like third-person shooters before 
they could bring their own products to the 
Internet. 

The media portrayal of these brick-and-mortar
industries has often been savage,
characterizing companies built on creative works as
utterly lacking the imagination necessary to
deliver their content over the Internet.

Whatever truth lies in those characterizations,
one important consideration has been
overlooked. These companies probably didn't
even own the Internet rights to the works they
were criticized for not making available over the
Internet. No matter how wonderful the
technology, you can't deliver what you don't own.

New Burdens, New 
Opportunities 

Depending on where you sit, this recent
Supreme Court ruling is either a burden or an
exciting opportunity. For companies that
wanted to deliver decades of content from
traditional media over the Internet, the road
ahead is filled with lawyers, new contract
negotiations, and new payments to authors of
older material.

For others, however, this may truly be a
golden opportunity. Simply reading the
Court's opinion should have some companies
drooling at the possibilities. Based on the
findings in this new case alone, there are now
decades of freelance content that was
originally published in The New York Times,
Newsday, and Sports Illustrated that’s up for
grabs on the Internet. These companies don’t
own it.

Now make no mistake, since the verdict was
released, The New York Times and other
publishers have probably been painstakingly seeking
out the creators of their previously licensed
property and obtaining new rights for the
Internet. But that’s a very long road, and it's
doubtful that any of them will ever complete
that task.

Don't Be Fooled Again 

The fact that people are fighting over Internet
rights and debating whether they’re included
in ten-year-old contracts that make no mention
of the Internet illustrates how truly
revolutionary this medium is. Few companies saw
it coming.

Even many contracts negotiated during the
computer age completely disregarded the idea
of digital delivery. Companies may have
secured rights to delivery on diskette, CD-ROM,
or some other storage media, but few included
the idea of delivery over a wire in their
licensing agreements. Who knew?

As a result, purchasers of creative rights are
becoming much more savvy about the uses to
which content can be put. They’re now seeking
and receiving broader rights than ever before.
I’ve even seen contracts that request rights to
distribute content by any means that’s
currently available or that may be invented in the
future.

Tasini v. The New York Times Company was
unquestionably a victory for independent
creators, and may be a hard pill for publishers
to swallow. But in the future, the tables may
be turned, because the publishers won’t be
fooled again.


Bret is an intellectual property and Internet
attorney, and a partner with Hancock, Rothert &
Bunshoft. You can reach him at bret@lextext.com.


October 2001 www.webtechniques.com 13




strategy 

Andrew Chak

The site has grown too big, too fast, and they hired you to fix it. So
where do you start? There are techniques and people who can help you
become a better information architect. You're about to learn the
techniques; your users are the people who can help you. Through techniques
such as personas, card sorting, and pen and paper testing, you stay close
to your users and should have a good idea of how to design for them.

The Definition Phase 

Effective information architecture starts with defining your site’s goals
and its target audience. Define your site goals by what you want your
users to do. Do you want them to read content? Buy something?
Register? Apply for an account? Define your goals in terms of specific user
tasks, and be sure to include quality of user experience. The statement,
“Users will be able to apply for a mortgage online” is too ambiguous. A
more useful, measurable benchmark to aim for would be, “Users will be
able to apply for a mortgage online without assistance and within ten
minutes.” Also, learn to prioritize your goals sequentially (1, 2, 3...).
Otherwise, you’ll end up with a long list of “highly critical” objectives. If
your team is having difficulty prioritizing, give everyone a fictional
budget of $100 and ask them to allocate funds across the goals.
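
The $100 exercise is easy to tally by hand, but a short script makes the resulting ranking explicit. The following is only a minimal sketch: the team members, goal names, and dollar amounts are invented for illustration, and the rule that every member must spend exactly $100 is an assumption about how you run the exercise.

```python
from collections import defaultdict


def rank_goals(allocations):
    """Sum each goal's dollars across team members and rank highest first."""
    totals = defaultdict(int)
    for member, spending in allocations.items():
        # Assumed rule: every member spends the whole fictional budget.
        if sum(spending.values()) != 100:
            raise ValueError(f"{member} must allocate exactly $100")
        for goal, dollars in spending.items():
            totals[goal] += dollars
    # Sort by total dollars (descending), then by name so ties are stable.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical team input, not data from a real project.
allocations = {
    "editor": {"apply for mortgage": 60, "read rates": 30, "register": 10},
    "developer": {"apply for mortgage": 50, "read rates": 20, "register": 30},
    "marketer": {"apply for mortgage": 40, "read rates": 50, "register": 10},
}

ranking = rank_goals(allocations)
for position, (goal, dollars) in enumerate(ranking, start=1):
    print(f"{position}. {goal}: ${dollars}")
```

Because every goal ends up with a dollar total, the output is a strict 1, 2, 3 ordering rather than a list of equally "critical" items.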

Now, for your users. To better understand them, start by reviewing 
emails, letters, or comments that users or customers have already 
provided. This will identify their interests as well as areas for further 
investigation. Next, talk to them. Interview them about their needs, 
wants, and concerns. If you don’t have direct access to your users, find
people who do: salespeople, account managers, call center staff, or
anyone else who comes in direct contact with your users. Also, note what
users are saying about you in newsgroups. Pay attention to their rants
and raves and the terminology they use. For example, go to
groups.google.com and do a search on your site’s topic to see what
users are discussing.

online resources: for your information

If you’re still scratching your head, here are a few helpful resources.

Argus Center for Information Architecture
www.argus-acia.com

InfoDesign
www.bogieland.com/infodesign/index.htm

Your Users 

Once you know your users, the next step is to define them. Don’t fall 
into the trap of designing for the average user, because this will lead to 
a generic, compromised design. If I asked you to design a T-shirt for an 
average user, you might give me a white T-shirt that fits everyone, 
offends no one, but ends up being used as underwear. If I ask you to 
design a T-shirt for a skateboarder in Van Nuys, California, you now have 
a specific user in mind. You can now decide which color and graphics 
yield a design that will engage the target user. 

Define your users through personas: fictional characterizations of
your target end users. Personas help you to keep specific users in mind
while you design. Start by creating personas for each of the different
types of people who use your site. For example, an insurance site may
have personas representing heads of families, singles, and widowers.

The more realistic your personas are, the more useful they are. Give
them specific names and even clip out photos to represent them.
Personas should also include demographics, interests, needs, and concerns.
Once you’ve finished, select a target persona. The criterion for selecting
the target persona is simple: pick the hardest user to please.

Here's an example persona for the head of a family using 
an insurance site: Victor Lehman is 37 years old and he and 
his wife, Serene, had their first baby a month ago. After 
some parental leave, Victor is back at work as a production 
engineer at a technology manufacturing plant. Victor is 
enjoying fatherhood; however, he does miss having a full 
night’s rest and being able to play golf on weekends. Victor 
is experienced with computers, but only uses the Internet 
when he needs something. And he’s still hesitant to make 
online purchases due to privacy and security concerns.

EFFECTIVE INFO ARCHITECTURE


After you’ve selected your persona, make that person the focal point 
of your design discussions. Instead of having endless arguments over 
what the user wants, your team should now make its decisions based 
on what Victor wants. This will help you and your team to be more 
focused and effective in making design decisions. 

Site Content and Functionality 

The next step is to specify the content and functionality that will 
support your business and user goals. To do this, marry business and 
user goals with ideas for proposed site content and functionality: 

Business Goal: Enable users to apply for life insurance online without 
assistance within 15 minutes. 

User Goal: “Help me find and apply for the right type of life insurance 
with the right amount of coverage to ensure that my wife and child will 
be provided for if anything happens to me.” 

Site Content and Functionality: 

* Explanation of different types of life insurance.

* Life insurance needs calculator.

* Insurance application form.

* Application form help.

* Security and privacy policy.

Scenarios 

Scenarios describe how your site content and functionality link to one
another; they don’t include design specifics. Scenarios should focus on
the user’s task and should never specify any user interface details.
Integrate your target persona to keep scenarios realistic and user focused.

A sample scenario of Victor applying online for life insurance might 
be something like this: Victor has a few minutes at the end of his 
lunch break, so he decides to surf the Internet to find out more about 
getting life insurance now that he’s a new father. He has never bought 
life insurance on his own before, so he reads up on the different life 
insurance options available. He also uses a quick calculator to find out 
how much insurance he needs to provide for his family should 
anything happen to him. The calculator leads him to an online
insurance form that guides him through the application process. Helpful
explanations for each step on the form make it easy for him to fill it 
out. He also looks at the security and privacy guarantee to reassure 
himself that his online transaction will be protected. He submits his 
information and feels good about taking care of this necessity in such 
a short time. 

Bottom Up Design 

I’m a strong believer in a bottom-up site design approach, because 
users experience your site on a page-by-page basis. Users shouldn't 
notice your navigation system or your section and subsection page 
templates, nor do they care to. However, designers and architects often 
design from the top-down and hope that content and functionality will 
fall into place and fit in the templates. This might make for some clean
overall design, but it doesn’t do as much for users who are on a
particular page and trying to figure out where to go next.

Your site design focus should be on the user tasks outlined in your 
scenarios. Start with tasks that support your highest priority business 
goals. After you’ve designed individual tasks, you can then focus on an 
overall structure that links everything together. 


Wireframes. The first step in the design process is to create a sketch of 
what your screens look like through a wireframe. Wireframes provide a 
rough page layout and can elaborate on the page content. A series of 
wireframes also illustrates the screen flow of a particular scenario. 

To see what the webtechniques.com home page would look like as a 
wireframe, see Figure 1. Now, you might be asking, “Why bother? Why 
can’t I do mock-ups in Photoshop or go straight to HTML?” But creating 
a wireframe helps you focus on how your site works and reads, not on 
how it looks. Once you start using Photoshop or HTML, it’s easy to 
become distracted by the visual design and lose sight of the content 
and functionality that will drive your site’s user experience. 

There are different techniques for creating wireframes, such as pen 
and paper, or any number of drawing or presentation packages (Visio, 
Adobe Illustrator, PowerPoint). It doesn’t matter what you use to create 
wireframes, provided you remember two things. 

First, wireframes should be quick and easy to change. They’re meant
to provide you with a simple way to move through multiple design
iterations. The more you iterate, the better your designs will be. If it takes
too long to create your wireframes, then you’re probably over-designing
them, which brings me to the second point.

Second, wireframes shouldn't look like designs. They aren’t art; they should
be plain, simple, and functional. If you’ve given them beveled edges and
are using the latest Photoshop filter, you’ve gone too far. I also make my
wireframes in grayscale only, so that visual design doesn’t become a
distraction. Consider using PowerPoint as a tool for creating wireframes.
The limited drawing tools help ensure that your wireframes aren't
over-designed, and it lets you present screens in a linear format to describe
scenarios. It’s also convenient for electronic distribution and for sharing
notes among team members.


Figure 1. WebTechniques.com as a wireframe. Wireframes
help you focus on how your site reads.

Navigation Map. While a wireframe tells you what goes on the screens,
a navigation map is a visual representation of how the screens are
linked together. Navigation maps document the path variations for navigating
between screens, and they provide a means to check the consistency of
your interaction design. Navigation maps are also quite useful as visual
checklists of all of the pages for which you need to design and create
content. Figure 2 shows a simplified example of a shopping cart
navigation map. Currently, there is no standard visual nomenclature for
drawing navigation maps. However, jjg.net (www.jjg.net/ia) provides a
good set of symbols from which you can start.
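
A navigation map is usually drawn, but the same information can be written down as a simple table of links, which lets you check it mechanically. The sketch below is one possible way to do that, not a standard notation: the screen names and links are hypothetical, and it only checks one thing, namely whether every screen can be reached from the entry screen.

```python
from collections import deque

# Hypothetical shopping cart screens; each key links to the screens listed.
NAV_MAP = {
    "product": ["cart"],
    "cart": ["product", "checkout"],
    "checkout": ["cart", "confirmation"],
    "confirmation": ["product"],
    "gift wrap": ["cart"],  # nothing links here, so users can't reach it
}


def reachable_from(nav_map, start):
    """Return the set of screens a user can reach by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in nav_map.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def unreachable_pages(nav_map, start):
    """List screens that exist in the map but have no path from start."""
    return sorted(set(nav_map) - reachable_from(nav_map, start))


print(unreachable_pages(NAV_MAP, "product"))
```

Any screen the check reports is either missing a link somewhere or doesn't belong in the design, which is the kind of inconsistency the visual map is meant to expose.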

Content development. Form and function go together; so should
content and architecture. If yours is a functional site, collaborate with
your editor to decide what will appear on each page to guide users
through the process. If you have a content-oriented site, your
information architecture should always guide users to the next article or other
areas of interest.

Site Architecture 

So it’s finally time to design the overall architecture of your site. 

Site structure. Once you’ve prepared the individual elements, you need 
a site structure that supports your ideal user experience. Specifying your 
site structure is like designing the aisles in a supermarket. Keep in mind 
that different users may think about structure differently. For example, 
some people may look for the Chinese egg noodles in the pasta aisle 
while others will head to the international aisle. To find out what
preconceived ideas users may have about your site structure, use a
technique called card sorting.

Write down the names of your site’s content and functionality 
components on separate cards and ask your users to sort them into 
groups. The results give you a sense of how they expect the site to be 
structured. Recruit about six people who are representative of your 
site’s various personas. This will help you see whether there are any 
discrepancies in the ways different types of users would structure your 
site. Take special note of content or functionality names that confuse 
your users, and consider renaming them in your final design. Ask your 
users to explain why they grouped cards the way they did, then ask 
them to suggest a name for each group. After you’re done, look for 
common groupings and use them as input for your own card sorting 
exercise to finalize the site structure. 
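
One way to "look for common groupings" is to count how often each pair of cards ended up in the same pile. The following is a rough sketch under that assumption: the card names come from the insurance example earlier in the article, but the three participants' sorts are invented, and pair counting is just one of several ways to summarize card sort results.

```python
from collections import Counter
from itertools import combinations


def pair_counts(sorts):
    """Count how many participants put each pair of cards in the same group."""
    counts = Counter()
    for groups in sorts:          # one entry per participant
        for group in groups:      # one list of card names per pile
            # Sort names so ("a", "b") and ("b", "a") count as the same pair.
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return counts


# Hypothetical results from three participants.
SORTS = [
    [["needs calculator", "insurance types"],
     ["application form", "form help"],
     ["privacy policy"]],
    [["needs calculator", "insurance types", "privacy policy"],
     ["application form", "form help"]],
    [["insurance types"],
     ["needs calculator", "application form", "form help"],
     ["privacy policy"]],
]

counts = pair_counts(SORTS)
for (first, second), times in counts.most_common():
    print(f"{times} of {len(SORTS)}: {first} + {second}")
```

Pairs that nearly everyone grouped together are strong candidates for the same section; pairs with split results are the ones worth asking users about.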

Labeling. Once your structure is solidified, decide on labels for your 
sections and subsections. This is an extremely important task and 
shouldn’t be taken lightly. If you pick the right labels, your users will 
effortlessly navigate through your site. Pick the wrong labels and your 
users may never find what they’re looking for. 

Don’t be creative with your labels; they should be simple, to the
point, and make sense to your target users. A quick way to test your
labels is to ask your users to guess which of the main sections contain 
certain components of content or functionality. If they don’t know 
where to find something, ask them why and ask them to suggest a label 
that would make sense to them. Always keep in mind that your section 
labels should be there to help users find what they’re looking for. 

Site Map. The site map is similar to the navigational map except that it
focuses on overall site structure. Like the navigational map, this is a
visual check and balance for maintaining site consistency, but it also
serves as a reference point for site structure and labels.

Site Level Wireframes. In the same way that you created wireframes 
for specific site content and functions, you also need to create them for 
site-level elements. Site-wide elements include: home page, navigation 
system, main section pages, subsection pages, content page template, 
utility bar, and search.

Figure 2. A simplified shopping cart navigation map.


A Final Word 

Developing your skills as an information architect involves constant
practice. The more you listen to users and observe them using your
designs, the better you’ll be at designing for them. To be a good
information architect, you must be able to look at your designs not from
your own perspective but from your users’. The more you see like them,
the better you’ll be.


Andrew is the principal consultant of user experience design at Derivion,
where he’s responsible for the usability of electronic bill presentment
and payment products. You can reach him at
andrewchak@hotmail.com.

Cure for Plug-in Heartbreak

What seemed unthinkable during the great
4.x browser wars will soon be upon us:
Browsers finally support a cluster of World
Wide Web Consortium (W3C) recommendations,
which were finalized late in the twentieth
century. Developers will find it easier to
work with standards like HTML 4, XHTML,
Cascading Style Sheets (CSS), and Document
Object Model (DOM).

Also included in the mix is ECMAScript, a
standardized, core JavaScript specification.
Most browsers’ JavaScript implementations are
already ECMAScript compliant. This ubiquity
makes it the default language for accessing
and manipulating page elements via the DOM.
Dynamic HTML, or DHTML, is the common
rubric for fine-grained control over page
appearance and function.

Prior generations of browsers featured 
incompatible DHTML implementations of 
varying quality. (JSS, anyone?) Those brave 
enough to use these features often found 
themselves writing multiple versions of a 
page to support different browsers, or to 
support only one browser type, letting users 
of other browsers eat cake. This impasse
provided an opening for proprietary technologies
implemented as browser plug-ins to position
themselves as solutions for enhanced
Web content. While plug-ins serve a valuable
purpose, particularly for multimedia, content
developed for them is not universally
viewable. Platform dependencies and content
encased in proprietary technology crept onto
the Web in an ugly way. Worse yet, these
technologies were often used to create solutions
that could have been handled by DHTML. In
particular, certain uses of Java applets and PDFs
come to mind.

But now that all major players are pledging
fealty to the W3C standards, the Holy Grail of
writing DHTML code once for all clients may
be within reach, ending an era of balkanization
and non-adoption. This convergence is
occurring just in time; unlike their predecessors,
many of the browsers covered in this
article will be embedded in the future wave
of Internet appliances and other fixed-function
networked devices. As slowly as PC users
upgrade their Internet software, Internet
appliance software looks to be less upgradable
and even more permanent. As a result,
developers may be forced to target these
browser platforms well into the foreseeable
future. Fortunately, the browsers that I review
below hew closely enough to the standards
to constitute a development platform for
dynamic, yet browser-universal, Web
development.

It’s now safe for many of the dynamic 
elements in a Web application to migrate from 
the server side to the client side. Web sites and 
applications adopting DHTML can enjoy the 
widest possible audience and a richer and more 
responsive user experience, while seeing less 
client/server network traffic. This is a
development that can spark great leaps in Web
usability, and perhaps Web business as well.

Internet Explorer 6 

Internet Explorer 5.x for Windows earned a
mixed report card for standards compliance.
While it attempted to incorporate support for
the W3C standards, it fell short in many areas
of correctness and completeness, like the
infamous CSS box model bug. However, it’s still a
force to be reckoned with because of its
substantial installed base. It was the predominant
browser shipped when worldwide PC sales
were at their peak, and it continues to ship
with new PCs and ISP connection kits.


online resources: browser madness

Smart developers know their browsers.

Internet Explorer 6
www.microsoft.com/windows/ie

Opera 5
www.opera.com

Mozilla/Netscape 6
www.mozilla.org

Konqueror 2.1
www.konqueror.org

Amaya 5.1
www.w3.org/amaya

Although its successor, IE6, is notorious for
its controversial Smart Tags technology, IE6 also
has a highly compliant rendering engine. This
engine is notable for its compliance with HTML
4, CSS Level 1 (CSS1), XML, and DOM Level 1
(DOM1). IE6 can render a page in two different
modes: strict, which is compliant with
standards, and quirky, which is compatible with
IE5.x’s idiosyncratic implementation of those
standards. Certain declarations in the DOCTYPE
tag cause a page to be rendered in strict mode,
while others (or omitting DOCTYPE entirely)
cause a page to be rendered in the quirky,
IE5.x-compatible mode. In this way, IE6 gives
developers and projects that have targeted IE5.x the best
of both worlds: compatibility both backward
and forward.
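
To make the idea of DOCTYPE switching concrete, here is a rough sketch of how a script might guess which mode a page would get. Treat the rules as assumptions: they are a simplified reading of commonly described browser behavior (no DOCTYPE means quirky; strict HTML 4 and XHTML declarations mean strict; transitional or frameset HTML 4 is strict only when the declaration includes the DTD's URL), not Microsoft's documented table. The function name and the fallback for unrecognized declarations are also invented for this example, so verify real pages in the browser itself.

```python
import re


def guess_render_mode(html):
    """Return 'strict' or 'quirky' for an HTML page (simplified heuristic)."""
    match = re.match(r"\s*<!DOCTYPE\s+([^>]*)>", html, re.IGNORECASE)
    if match is None:
        return "quirky"  # no DOCTYPE at the top of the page
    declaration = match.group(1).lower()
    has_dtd_url = "http://" in declaration
    if "xhtml" in declaration:
        return "strict"
    if "html 4" in declaration:
        is_loose = "transitional" in declaration or "frameset" in declaration
        # Assumed rule: loose DTDs only trigger strict mode with a DTD URL.
        return "strict" if (has_dtd_url or not is_loose) else "quirky"
    return "quirky"  # assumption: unrecognized declarations stay quirky


strict_page = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
               '"http://www.w3.org/TR/html4/strict.dtd">\n<html></html>')
loose_page = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'
              '\n<html></html>')

for page in (strict_page, loose_page, "<html></html>"):
    print(guess_render_mode(page))
```

A check like this could run over a site's templates to flag pages that would fall into the backward-compatible mode without anyone having intended it.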

Regardless of any suspicions you might have
about Microsoft using Internet Explorer to
advance an overall agenda, IE6 is a good citizen
on the Web, and supports the same set of
standards as its competitors.

The version numbering isn't consistent 
between IE for Windows and IE for Macintosh. 
IE5 for Mac features the same strict/quirky 
rendering modes as IE6 for Windows, and has 
been available longer. Both versions are
downloadable at no cost.

Opera 5 

Opera was an early leader in HTML 4 and CSS
compliance. The current version continues this
tradition with its support for XML and nearly all
of CSS2. Opera’s DOM implementation isn’t yet
Level 1 compliant, but it has promised Level 2
compliance.

Vexingly, Opera has diverged from certain
specifications, particularly in cases where
its developers felt that their existing implementation
was more elegant. This is an act of hubris
for a company with a browser that has a
relatively limited installed base. As various free,
standards-compliant competitors have 
emerged, Opera has become more notable 
because of the manifold platforms for which 
it’s available (including OS/2, Solaris, and a 
variety of embedded platforms). It also 
provides flexible and powerful features for the 
end user. Opera’s ability to masquerade as 
other browsers, such as IE and Netscape, on 
Web sites lets you enter sites that turn away 
users based on the browser they’re using. 

A proprietary product, Opera offers a freely
downloadable version with banner ads, and a
standard version that retails for $39. While you
might question the sanity of anyone charging
for a browser nowadays, Opera’s plot seems to
hinge on volume licensing for embedded
appliances. The revenue it generates from browser
connoisseurs, OS/2 ragamuffins, and assorted
other PC users seems slight.

Mozilla/Netscape 6 

The Mozilla project is an open-source effort
sponsored by AOL/Netscape to create a
successor to the venerable Netscape Communicator
4.x series. The early decision to create a new
layout engine, called Gecko, from scratch was
among the first of many design decisions that
contributed to the widespread panic about
Mozilla’s development pace. Mozilla’s decision
to abandon the much-loathed 4.x layout
engine broke backward compatibility with
pages and applications developed for Netscape
4.x. However, moving forward with Gecko has
positioned Mozilla as a leading exponent of
standards compliance and as a reference
platform for developing W3C-friendly DHTML.
Gecko is also used in other open-source
applications, such as the Galeon browser and the
Nautilus file manager. As a result, those
applications are virtually identical to Mozilla/
Netscape in their rendering behavior.

Mozilla is still approaching version 1.0, but 
it’s already quite stable and usable. Netscape 
6.0 is based on an earlier release of Mozilla, 
while Netscape 6.1 is based on more current 
pre-1.0 Mozilla code. The Mozilla community 
has contributed browser ports to a plethora of 
platforms, such as OpenVMS, OS/2, BeOS, and 
various flavors of Unix, in addition to the
primary development targets of Windows,
Linux on Intel, and Macintosh.

Konqueror 2.1 

From its humble beginnings as kfm, the file
manager for the Unix-based K Desktop
Environment (KDE), Konqueror has become a
full-featured Web browser. Because Konqueror is
a collaborative open-source development
project, W3C standards compliance has
always been a stated goal. The khtml library
handles all of its rendering, so Konqueror’s
page display ability, as well as its support for
HTML 4, CSS, DOM, and ECMAScript, is
available to other KDE applications. Similarly,
Gecko can be incorporated into applications
other than Mozilla.

You can even configure Konqueror to use
Gecko in place of khtml. In addition to KDE,
Konqueror has been ported to Qt/Embedded, a
proprietary widget library for embedded Linux
systems that bypasses the X Window System.


Amaya 5.1 

Amaya is the W3C's own browser and
WYSIWYG editor. It has been under continuous
development since 1997; hence, it’s primarily a
demonstration project. Amaya implements
various W3C technologies like XHTML, CSS,
MathML, Scalable Vector Graphics (SVG), and
Annotea, a project building an open
infrastructure for attaching and sharing Web page
annotations. SVG is a logical addition to the
W3C portfolio. Unlike raster-based GIF and
JPEG images, SVG images can be viewed
regardless of screen real estate, and they can be
controlled dynamically through the DOM.

Amaya's interface attempts to break the 
functional barrier between page viewing and 
editing: view a page, then start editing it in the 
same window. Amaya isn’t intended to be a 
general-purpose browser. It lacks many of the 
convenience features of conventional browsers, 
such as bookmarks or the ability to assume
“http://” when the user types “www.foo.bar.”
The keyboard shortcuts are nonsensical to 
anyone other than Emacs fans. Amaya also 
lacks a JavaScript interpreter. As a result, DOM 
isn’t supported either—a major deficiency 
when you’re comparing this browser to the rest 
of the group. Amaya is distributed under the 
BSD license, so its source code may already 
have been harvested for use in other browsers, 
both proprietary and open source. 

Are We There Yet? 

Every vendor’s documentation on standards 
compliance is loaded with qualified phrases, 
such as “almost full support for” or “nearly 
complete implementation of.” But vendors 
have made an effort to support the most 
useful and popular portions of the standards 
first. Elements that aren’t yet implemented are
still parsed correctly and politely ignored. The
result may not be perfect, but your pages
won’t be seriously broken.

If you’re developing for an intranet or any 
other environment with control over browser 
clients, start lobbying for migration to one of 
these browsers soon. And if maximum
accessibility (a particular requirement for
government Web sites) is one of your goals,
adherence to the W3C standards helps you
ensure that pages are still readable by
alternative browsing devices: text-based, text-to-voice,
Braille, small screens, and so on, not just the
800x600 pixel mainstream.

If you’re developing a public site, especially
a commercial one, ask yourself the familiar
questions about where you’re willing to draw
the line and make the trade-off between the
latest, most elegant code and the site’s
availability to legacy browser users. If your site is
connected to a customer relationship
management system, can the system tell you which
browsers your best customers use? At this
point, big spenders online are often
early-adopter types who want to browse closer to
the leading edge.
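
If your order records include the User-Agent string each customer's browser sent, a few lines of script can answer the "which browsers do my best customers use" question. This is only an illustrative sketch: the order data is made up, the family names are this example's own, and the substring tests are a simplification of real User-Agent parsing. Note that Opera is checked first, because it can identify itself as IE.

```python
from collections import Counter


def browser_family(user_agent):
    """Classify a User-Agent string into a coarse browser family."""
    ua = user_agent.lower()
    # Order matters: Opera can masquerade as IE, and most browsers
    # begin their User-Agent with "Mozilla".
    if "opera" in ua:
        return "Opera"
    if "konqueror" in ua:
        return "Konqueror"
    if "msie" in ua:
        return "Internet Explorer"
    if "gecko" in ua:
        return "Mozilla/Netscape 6"
    if ua.startswith("mozilla/4"):
        return "Netscape 4.x"
    return "other"


def spending_by_browser(orders):
    """Total the order amounts for each browser family."""
    totals = Counter()
    for user_agent, amount in orders:
        totals[browser_family(user_agent)] += amount
    return totals


# Hypothetical (User-Agent, order amount in dollars) records.
ORDERS = [
    ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", 120),
    ("Mozilla/4.0 (compatible; MSIE 5.0; Windows 98) Opera 5.12 [en]", 40),
    ("Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.2) "
     "Gecko/20010726 Netscape6/6.1", 75),
    ("Mozilla/4.77 [en] (Windows NT 5.0; U)", 15),
    ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", 60),
]

totals = spending_by_browser(ORDERS)
for family, dollars in totals.most_common():
    print(f"{family}: ${dollars}")
```

Totaling by dollars rather than by visits keeps the focus on where revenue comes from, which is the trade-off the paragraph above asks you to weigh.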

Judicious use of the W3C standards gives 
you clean, future-proof code that still displays 
acceptably on legacy browsers. Is this good 
enough for your 3.x browser users? How about 
4.x users? Where you draw that line determines 
which cool features you can use. You may still 
be managing your page layout with tables so 
that you can accommodate your users with 4.x 
browsers, for example. 

However far you can currently commit your
site, allegiance to the API constituted by W3C
standards is the one true path for the time
being. If you design your applications to rely
on standards-compliant DHTML instead of
browser plug-ins, you'll attract a more
universal audience, and ensure machine
readability and searchability. Doing this also
protects you from the caprices of software
companies. Plug-in-dependent content has
its time and place, but it pays to re-examine
the state of good old HTML and friends,
and the browser Class of 2001, the first in a
long time to get it right.

-Charlie Cho 

Charlie Cho is an independent consultant who
lives in San Jose, CA. You can email him at
charlie@cheaux.com.

Flash gives designers everything they need: control, a GUI interface, and
a direct link to the creative process. You can’t say that about markup.




Flashers Unite 


The only successful Web technology that has truly been
developed with designers in mind is Flash. Or so I once
wrote, only to find a virtual ton of passionate email in my
inbox the next day. The responses were divided. Many found
my comment not only controversial, but confusing, so I’d
like to clarify how I came to this opinion.

First you have to look at the way contemporary designers 
have been trained. If you came to the Web from a visual 
design background within the last ten years or so, you’re 
probably used to working with visual tools. The argument is 
really as simple as that. Flash is a visual tool. Markup, 
scripting, and other language-based technologies on the 
other hand, have few viable options when it comes to 
entirely visual editors. 

What Designers Want 

When I’m in designer mode, I don’t even bother thinking 
Web. I open Illustrator, Quark, or Photoshop (or all three) 
and use these tools to help me achieve all of the things 
designers want. Illustrator lets me work with complex 
shapes and integrate type into a design. Photoshop is the 
consummate tool for working with graphic elements and 
tapping into the power of history and layers to give my 
work flexibility. Quark lets me refine layout, and work with 
margins, text placement, text flow, and white space. 

Designers will be very familiar with this trio of tools, or a 
similar grouping including FreeHand, PageMaker or 
InDesign, or CorelDraw. Whatever the tools, designers using 
them can create unusual and interesting layouts without 
thinking of the constraints that markup imposes. The act of 
designing with visual tools is at first a creative process, a 
process that’s very different from writing markup. 

Designers want control, whether it’s over font size, 
margin measurements, specific positioning, or real color. If 
you’re a Web designer or a Web developer, you undoubtedly 
know that achieving these things with Web markup 
and related technologies can be very challenging, particularly 
if you’re designing for a wide audience using a variety 
of browsers. 

The question then, is how to balance your need for the 
control you find in contemporary design tools with effective 
markup. 

What Visual Editors Offer (or Don't) 

Because visual designers tend to work in the visual world, 
many have eschewed hand-coding HTML and are relying 
instead on popular visual HTML editors. This isn’t necessarily 
a bad thing, particularly as certain programs are becoming 
more sophisticated and inclusive of W3C ideologies. But 
for the time being, using visual editors successfully requires 
you to have an understanding of both markup and the 
design tool in question. 

Even the most sophisticated visual editors demand that 
users have a pretty hefty knowledge base. If you use the 
grid or visual layout editors in programs such as GoLive or 
Dreamweaver, your markup will be much heavier than if 
you understand something about working with tables and 
use a combination of your knowledge and the visual 
editor’s interface to achieve your results. What's more, 
without a deep understanding of contemporary markup 
goals (accessibility, separation of presentation and 
formatting, and globalization), you’ll have more trouble producing 
documents that look good, function fabulously, and adhere 
to W3C recommendations. 

Part of the problem is (say it with me) that HTML was 
never meant to be a design language. We’ve stretched and 
pummeled it into submission. By HTML 4, most Web people 
realized that document formatting and visual presentation 
should be separated. With XHTML, that lesson grew 
stronger—particularly with XHTML 1.1, which was built 
mostly from the XHTML 1.0 strict DTD. 

I dare you to get successful margin control, consistent 
fonts, positioning of visual elements, and application of 
color or background graphics in a strict environment. Right 
now, you can’t do it and be fully interoperable across 
browsers, browser versions, and platforms. 

CSS You Say? 

CSS is another technology created for designers. The concept 
of style sheets is very familiar to many designers, especially 
those who’ve worked a great deal in desktop publishing, 
where styles are often separated from documents. However, 
while the concept makes tremendous sense to a lot of 
designers, its implementation remains difficult. 

When all the world has browsers that can successfully 
support as many features of CSS as possible, we’ll be happy 
campers. Well, mostly. Learning to write sound CSS is a serious 
commitment. And as I pointed out, for many visual 
designers, writing anything codelike is distasteful and 
downright problematic. It’s like telling a graphic designer to 
create an Illustrator file by writing out the PostScript code. 
Good luck! 

Until some genius develops a fully visual interface that 
writes proper and efficient CSS, it’s hard to wedge CSS 
into a visual designer’s workflow. That means CSS, while 
built for designers, isn’t yet an effective design tool. 

Scripting 

DOM-related scripting via ECMAScript, JavaScript, and 
DHTML definitely offers an array (if you’ll pardon the pun) of 
options for designers. This is especially true 
when we want to create interactive events 
from simple mouseover graphics to sophisticated 
games. 

Though scripting can truly empower a 
designer, it suffers from the same problems as 
CSS and markup. Sure, you can grab free scripts 
all over the place, but that’s a haphazard way 
to build a professional application, especially if 
you don’t understand enough about a given 
script language to troubleshoot problems. And, 
the same interoperability issues exist, especially 
with DHTML. Then remember that some 
site visitors turn off JavaScript, and it’s not 
hard to understand why most scripts aren't 
built for designers. 

Emerging Design Technologies 

But what of technologies like Scalable Vector 
Graphics (SVG) and SMIL? Aren’t these geared 
toward designers? Yes, they are, and they’re 
both growing in terms of support and tools. 

SVG is particularly interesting because of its 
scalability, language-based structure, and, 
most especially in the context of this article, 
the fact that visual tools are being built for it 
as the technology is designed. Several major 
applications, including CorelDraw, have added 
or are adding SVG export filters to their software. 

Batik, an open-source toolkit for SVG developers, 
is also a promising solution. (For more on 
SVG, see this issue’s Design section review of 
the SVG editor, Jasc WebDraw.) Designers can 
work in a visual editing environment, and the 
applications generate impressively sophisticated 
markup. 

There are a few problems with SVG, however, 
the foremost being that it requires a specialized 
plug-in of which most site visitors are 
unaware, or not interested in installing. This 
differs dramatically from Macromedia Flash, 
which most users already have installed as a 
default. And although animations are on the 
horizon in SVG, the technology still needs time 
to grow. 

SMIL's current state of affairs is similar. The 
technology as a whole is exciting for designers, 
because the focus is not only on design 
elements, but also a rich selection of interactive 
multimedia: text, images, sound, motion. 
All of this is in a development environment 
that’s somewhat accessible, yet demands a 
fairly high learning curve and balance of skills. 

Enter Flash 

From a visual design perspective, Macromedia 
Flash is the only tool that’s currently capable 
of providing a completely visual development 
environment and a wide choice of how to 
employ the technology. I should mention that 
other tools, such as Adobe LiveMotion and 
Beatware e-PicturePro, have GUI environments 
and SWF export. These tools also have a place 
in this discussion, but Flash is more widely 
known and has had more years of improvements, 
increasing its popularity, usability, and 
technology options. 

As you know, Flash produces and controls 
color, text, layout, shape, and motion. What 
did I say designers want? Control! Well, Flash 
gives it to them, and then delivers the product 
to the Web. 

Reading through that virtual ton of letters, I 
naturally came across quite a few from readers 
who were extremely opposed to Flash. Some 
argued that Flash was inaccessible, inappropriate 
for the Web, and that the plug-in issue 
remains a concern. On the other side of the 
fence are the enthusiastic designers who love 
the visual interface, and the control Flash 
affords, even over motion design. 

Now, am I advocating the complete abandonment 
of markup? Of course not. And, 
despite my often strident opinions on Web 
markup, adherence to recommendations, and 
the separation of document formatting from 
presentation, I still maintain that there’s a 
place for both approaches. 

Competent Web designers, developers, and 
project managers understand their audiences 
and a site’s intent. Ultimately, the appropriate 
form of communication must drive the site. 
These simple issues, not the preference of 
one technology over another, are where the 
trouble lies. 

An academic site with lots of technical 
documents and articles isn't a good place to 
use Flash. A site for a new band, appealing 
to a young, Web-savvy audience, is probably a 
great place to use Flash. And of course, you 
can always give people options if you want to 
provide them with the best of both worlds. 

Usability freaks who are so convinced that 
Flash is evil are wrong. Similarly, Flash fanatics 
who are unwilling to look at other means 
of achieving an end are also wrong. As with all 
sensible solutions, the answer lies not in the 
technology, but in its appropriate use. 

An author, instructor, and designer, Molly has 
been honored by Webgrrls as one of the 25 Most 
Influential Women on the Web. She has written 
and contributed to numerous books about the 
Internet and the Web. You can visit her Web site 
at www.molly.com. 

There’s something interesting happening this year in our industry. 

Despite the downturn, major companies are investing in their Web sites. 

Somewhere back in 1997, large companies decided that centralized 
Web development departments were too slow or too controlling to keep 
up with the rapid innovation that characterized the Web at that time. 
Soon thereafter, every department had a Web content group that operated 
more or less independently from others around the company, and 
had free rein to develop content that seemed right for its section. 

Content groups proved both good and bad: On the plus side, lots of 
useful content was created quickly, sites grew and matured at an 
astounding pace, and the Web's value became widely understood within 
these companies. Unfortunately, the sites became sprawling structures 
with unconnected silos of content that provided little continuity. They 
failed to provide a cohesive experience for the site’s visitors and were 
expensive to maintain. 

The current economic downturn has made it more important to optimize 
than to innovate. Companies are recentralizing management of 
some Web functions, and creating a hybrid process for content creation. 
Departments across the company can still generate content, but the 
holistic user experience (architecture, design standards, and basic 
functionality) is managed centrally. 

The push for centralization has two primary drivers: operational efficiency 
and user experience concerns. Content management systems 
(CMSs) improve efficiency by processing all content through a single 
storage and retrieval system. Instead of supporting an array of systems, 
the technical team can focus on maintaining and extending one platform 
for the whole company. Companies are addressing user experience 
concerns by reworking their sites, overhauling everything from the 
navigation to a site’s fundamental organization. 

Despite the temptation to start making changes, proceed with 
caution. Before you even consider a CMS migration or a re-architecture 
project, you’ll need to take a content inventory. These projects affect 
vast amounts of existing content, some of which may be redundant or 
outdated. They’re different from the typical projects that architects and 
developers have faced in years past, and they require new tools. 

The Post-Downturn Architect's Tool 

After years of boom and sprawl, many Web sites resemble L.A. County 
more than an organized system of resources: you'd need a really good 
road map to find your way around. Before an information architect can 
hope to reorganize your site to improve the user experience, someone 
needs to understand it: the scope, nature, and context of all those piles 
of content. In most companies, no one person is familiar with everything 
that’s there. 

The basic task of re-architecture is answering the question “What 
goes where?” The content inventory answers the “what” part of the 
question, so that you can get to work arranging the “where” using other 
architectural techniques. 

A content inventory is a methodical review of a Web site's content. 
It’s essentially a research project, and the information you glean from 
conducting it is sometimes as important as the deliverable you create at 
the end. There are various kinds of inventories that you can use alone or 
in combination to reach different ends. Three basic types of inventories 
cover most cases: 

A survey is a high-level review of core site pages, usually taken at the 
beginning of a project. Surveys help you understand the scope and 
nature of the material: the type of content, what topics it covers, and 
so on. At the end of a survey, you should have a clear understanding of 
the major chunks of site content. You can use the survey on its own, or 
as a launching point for other inventories. I usually find it helpful to 
structure the survey as a miniature version of a detailed audit. 

A detailed audit is a comprehensive, page-by-page site inventory. 
When complete, this audit lists every page by name and URL, assigns it 
a unique number to identify it, and lists major attributes of the page 
that will eventually form part of the important metadata. Often, architects 
find it easier to begin the detailed audit by doing a quick survey 
to flesh out a basic framework before beginning the page-by-page site 
review. The completed audit is useful during migration to content 
management systems. 

A content map is a visualization, a simple illustration of the site’s major 
content components. Resist the urge to arrange components by their 
current location within the architecture. Instead, group them to reflect 
the most important user and business objectives. Content maps are the 
most powerful of the three tools for understanding the big picture, and 
they can be derived either from surveys or from detailed audits. 

Quality inventories must be accurate, consistent, and thorough. If you 
take inventory with attention to detail and completeness, the end result 
becomes a solid basis for future architecture and migration work. If 
sections are missing or mishandled, the entire inventory loses credibility, 
and this isn’t the sort of task that you want to re-do. 

Setting Up 

Performing surveys and detailed inventories involves essentially two 
steps: Set up your file, and gather the data. The file templates for a 
survey and a detailed audit look virtually identical. They differ only in 
the amount of detail that you record for each page, and the number of 
pages that you review. In short, the survey records some information for 
a sampling of pages, while the detailed inventory records all information 
for all pages. 

You can set up the file in any spreadsheet or database application: 
Excel, Access, FileMaker Pro. I usually use Excel because it’s so widely 
known that I can feel comfortable handing the files off to clients or 
coworkers without worrying about whether they have the application or 
know how to use it. 

In the Excel file, every row corresponds to a page on the site, and 
every column is a piece of information about that page. The data that 
you’ll want to record for each page varies from project to project, but 
there are some good standards with which to start. 

There are three general types of data for each page: identification 
data, such as page title and URL; content data, which describes the page 
type and subject matter; and management data, which may include the 
content owner or producer, and flags for calling attention to stale content 
that should be removed from the site. 
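
If you'd rather prototype the file in a script before committing to a spreadsheet, the three kinds of data can be sketched as one record per page. This is only an illustration in Python, not part of the method described here: the field names are assumptions that mirror the fields discussed in this article, the URL is a placeholder, and the CSV output simply stands in for the spreadsheet rows and columns.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class PageRecord:
    # Identification data
    link_id: str
    link_name: str
    url: str
    # Content data
    content_type: str = ""
    document_type: str = ""
    topic: str = ""
    # Management data (hypothetical examples; pick what your project needs)
    content_owner: str = ""
    outdated: bool = False


def to_csv(records):
    """Write one row per page and one column per field, like the spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PageRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


home = PageRecord(link_id="1.0", link_name="Home", url="http://www.example.com/")
print(to_csv([home]))
```

A CSV file like this opens directly in Excel, so the hand-off to clients or coworkers stays the same.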

While the pertinent information varies according to the needs of your 
project, the following is a basic set of data fields that you can use. 
(These have been adapted from a methodology I learned from my business 
partner, Jesse James Garrett, author of jjg.net.) 

Link ID. In my audits, I give every page on the site a unique ID. It’s a 
minor annoyance, but a major benefit. With the link ID, you can reference 
pages with confidence. Referring to pages by URLs, which can be 
quite long, becomes cumbersome. By saying “look at item number 
53.6.1,” everyone can flip to that page in the inventory and be certain 
that you’re talking about the same piece of content. 

To create the IDs, I start by giving every page on the site-wide navigation 
its own number. Home, for instance, stands alone at the top level 
of the site. Its number is 1.0. At the next level, you might find About the 
Company, Products, and Customer Service. These would be numbered 
1.1.0, 1.2.0, and 1.3.0, respectively. Within the Products section, the 
Applications top page would be 1.2.1.0 and the Service Products top page 
would be 1.2.2.0. If there were five Service Products content pages below 
that, they would be 1.2.2.1, 1.2.2.2, 1.2.2.3, and so on. 

Pages with subpages get the .0 suffix, while pages without children 
don't. This way I know at a glance whether a given page has subpages. 
This also lets me use the Excel autofill feature to generate page IDs for 
the subpages in that section. To use these, I simply click on the parent 
page ID cell and drag down the column to fill in the sub-page ID values. 

As you build the inventory, every time you step down a layer in the 
navigational hierarchy you add another dot and digit. Over time, this 
numbering scheme instantly reveals both the breadth and depth of a 
page’s location within the site. In some sections you’ll find that you 
have eight or ten dots (meaning that it’s very deep) and in other 
sections you’ll find digits as high as 15 or 16 (meaning that it’s broad). 

For further graphic representation of the hierarchy, you can use Excel's 
indent feature to inset sub-page IDs. 
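
The numbering rules are simple enough to express in a few lines of code. The following is a rough sketch in Python, assuming IDs are stored as dotted strings and that a trailing .0 always marks a page with subpages; the function names and the sample ID are hypothetical and chosen only for illustration.

```python
def has_subpages(link_id):
    """A trailing .0 marks a page that has subpages."""
    return link_id.endswith(".0")


def depth(link_id):
    """Number of navigation levels, ignoring the trailing .0 marker."""
    parts = link_id.split(".")
    if parts[-1] == "0":
        parts = parts[:-1]
    return len(parts)


def child_ids(parent_id, count):
    """IDs for `count` leaf pages directly below a parent such as 1.3.0."""
    if not has_subpages(parent_id):
        raise ValueError(f"{parent_id} is not marked as having subpages")
    base = parent_id[:-2]  # drop the trailing ".0"
    return [f"{base}.{n}" for n in range(1, count + 1)]


print(child_ids("1.3.0", 3))
print(depth("1.3.2"), has_subpages("1.3.2"))
```

Counting the parts of an ID gives the depth of a page, and the largest final digit in a section gives its breadth, which is the same reading you get by eye from the spreadsheet.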

Link Name. In most cases you can use either the HTML page title or the 
link text within the <a href > tag to give you the link name. I usually 
find that one is more reliable than the other, depending on the site. 
Some sites use the same page title on multiple pages, but provide 
meaningful names in the actual link tags. No matter where you glean 
the information, your goal is to collect the data in the same way for 
every page. So look it over, make a decision, and stick with it throughout 
the project. 

URL. The URL and the link name can often be captured by a so-called 
spider or Web crawler program. These programs can give you a great 
head start on a detailed inventory, but they aren’t a panacea. The goal 
of the inventory is to produce a document that's meaningful to humans 
and represents the perceived architecture. If you use a Web crawler, 
review and edit the results manually, as the Web crawler rarely captures 
URLs in a way that follows the architecture. 
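
To give a sense of what a crawler captures, here is a minimal sketch that pulls URLs and link text out of a single page using Python's standard library. It is an assumption-laden illustration rather than a recommended tool: the LinkCollector class, the sample markup, and the example.com address are all made up, and a real crawler would also need to fetch pages and follow the links it finds.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (absolute URL, link text) pairs from one page of HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse line breaks and extra spaces inside the link text.
            name = " ".join("".join(self._text).split())
            self.links.append((self._href, name))
            self._href = None


page = '<p><a href="/products/">Products</a> <a href="about.html">About the\n Company</a></p>'
collector = LinkCollector("http://www.example.com/index.html")
collector.feed(page)
for url, name in collector.links:
    print(url, "|", name)
```

The output is a flat list in document order. It says nothing about where a page sits in the perceived architecture, which is why the manual review still matters.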

Content Type and Document Type. These two fields describe the content. 
Content type isn’t the same as topic: it tells you what kind of 
information it is, not what the information is about. For instance, 
marketing information, data sheets, technical specifications, and 
customer stories are all content types. You must decide on a complete 
set of possible types before you begin a detailed audit. This gives you a 
controlled vocabulary, a fixed set of values from which you can choose 
to fill the field. The document type field is similar, telling you what kind 
of document you’re dealing with: paragraphs, a list, a form, a white 
paper, and so on. 

By using a controlled vocabulary, you can begin to identify all pages 
of the same type in your site. 
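
A controlled vocabulary is easy to enforce mechanically. The sketch below, in Python, assumes each inventory row is a dictionary with link_id, content_type, and document_type keys; the value lists are taken from the examples in this article and are not a complete vocabulary for any real site.

```python
# Example vocabularies only; a real project would agree on its own lists.
CONTENT_TYPES = {
    "marketing information",
    "data sheet",
    "technical specification",
    "customer story",
}
DOCUMENT_TYPES = {"paragraphs", "list", "form", "white paper"}


def check_vocabulary(rows):
    """Return (link_id, field, value) for every value outside the agreed lists."""
    problems = []
    for row in rows:
        if row["content_type"] not in CONTENT_TYPES:
            problems.append((row["link_id"], "content_type", row["content_type"]))
        if row["document_type"] not in DOCUMENT_TYPES:
            problems.append((row["link_id"], "document_type", row["document_type"]))
    return problems


def pages_of_type(rows, content_type):
    """List the IDs of all pages that share one content type."""
    return [row["link_id"] for row in rows if row["content_type"] == content_type]


rows = [
    {"link_id": "1.2.1", "content_type": "data sheet", "document_type": "list"},
    {"link_id": "1.2.2", "content_type": "datasheet", "document_type": "list"},
    {"link_id": "1.3.1", "content_type": "data sheet", "document_type": "form"},
]
print(check_vocabulary(rows))
print(pages_of_type(rows, "data sheet"))
```

The misspelled "datasheet" is the kind of drift that creeps in over a long audit; catching it early keeps the same-type groupings trustworthy.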

Topic. This field describes what the content is about. This isn’t a standard 
values field, but rather an open field that you can fill with any 
words that describe the content topic. 

Management Fields. These are the most open fields, and you can use 
any that help you in your project. In past projects, I’ve used producer, 
content owner, user type (the intended audience), company type 
(customer, partner, and so forth), facets, frequency of update, and 
outdated flag. 

Cell Format Conventions 

Because consistency is paramount, and repeating work is painful when 
you’re creating a detailed inventory, establish cell-formatting conventions 
before you begin. As before, this is one reason why it’s best to 
start a detailed inventory by doing a survey. The survey gives you an 
opportunity to quickly review the issues you’ll encounter down the line 
and decide on a workable strategy. 

As I move through the inventory, I mark redundant content and 
cross-links by shading the link name and URL fields a light gray. I often 
use cell formatting tricks with color and indentation to illustrate the 
level under which information falls. For example, I indent the ID and 
title cells of child pages. In addition, I mark top-level links with yellow 
across the whole sheet; second-level with green across the first two 
cells only. Lower-level links I leave plain and indented to indicate their 
level; and I often include bracketed and italicized hierarchy notes in the 
URL field. 

Filling in the Survey 

So now you’re done setting up. Making the decisions about which fields 
to include is half the battle. For surveys, you won’t need to gather all of 
the information you’d probably want in the detailed audit. At least plan 
to capture link ID, page name, URL, content type, page type, and topic. 
Because this is your first review of the site, you won't have established a 
list of values for content type and page type yet. Don’t worry, that’s 
partly what the survey is for. 

Browsing the site and filling in information can be tedious, but it 
never fails to inform. You want to follow a broad selection of links to 
capture information about the major site sections. Look at the top 
pages and a variety of content pages in each section. As you fill in the 
spreadsheet, you’ll be sketching out the major features of the site. 

While it won’t show every page on the site, the completed survey 
should show every major content component. For a large-scale Web site, 
expect to spend about 40 hours on a survey. 

As you work on the survey, you can make a list of values for the fields 
that require controlled vocabulary, including content type and page 
type. When you’ve finished the survey, you’ll have a draft list to circulate 
among the project’s major stakeholders. Together, you can refine 
and edit the list until it’s fairly complete before you begin the detailed 
audit. Most sites have fewer than 25 content types, and fewer than 15 
page types, though these numbers can vary widely. 
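
If the survey lives in a file rather than on paper, the draft list can be tallied automatically. This is a small sketch in Python under the assumption that survey rows are dictionaries with free-text values; the draft_vocabulary function and the sample rows are hypothetical.

```python
from collections import Counter


def draft_vocabulary(rows, field):
    """Tally the free-text values entered during the survey, most common first."""
    counts = Counter(
        row[field].strip().lower() for row in rows if row[field].strip()
    )
    return counts.most_common()


survey_rows = [
    {"content_type": "Data Sheet"},
    {"content_type": "data sheet "},
    {"content_type": "Customer Story"},
    {"content_type": ""},
]
for value, count in draft_vocabulary(survey_rows, "content_type"):
    print(count, value)
```

The tally is only a starting point for the conversation with stakeholders; deciding which near-duplicates to merge is still a judgment call.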

Mapping the Content 

Once I’ve finished the survey, I take all of the site’s major content 
components, put each one on a sticky note, and cluster them according 
to user and business goals. If you have a clear understanding of these 
goals, this activity is fairly straightforward. This is a good activity to do 
with a small group of clients or co-workers. 

Your cluster groupings can be mapped using Visio, Photoshop, or any 
number of other visualization programs. I show redundancies across 
groups by stacking the boxes and coloring them differently. With a 
three-hour working session and five hours of independent work, you’ll 
have a content map to use as a conceptual reference for architecture 
decisions. Often, this visualization provides a radically different 
perspective on the site than a traditional architecture diagram would 
provide. With a good map, information architects can build stronger 
relationships between content, identify and eliminate duplications, 
and re-envision architecture with a view toward breaking out of 
content silos. 


Full Detailed Audit 

If you’re preparing for migration to a content management system, 
you’ll eventually need to take the framework from the survey and 
perform a detailed audit. Immediately prior to the migration, you 
should spend several weeks following every link on the site. Assembling 
a comprehensive listing of pages makes it possible to track those pages 
in the move to the new system. While this may feel like tedious work, it 
will give you a deep understanding of the site content. The greatest 
benefit of tracking pages this way is that you’ll be able to identify and 
eliminate redundant, outdated, and otherwise ineffective content. The 
detailed audit is a deliverable with a fairly short life span. Once the 
migration is complete, it will no longer be useful, so don’t be concerned 
about updating and maintaining the file in the long term. 

Rethinking Content Structures 

You need to know what you have to work with before you can organize 
it better. The inventory, above all else, helps you get to know the 
content deeply; this is as important to a re-architecture as understanding 
user goals and business goals. Make associations across groupings, 
identify redundancies, and slice it along a different grain. 


Janice is a partner with Adaptive Path, a user experience consulting firm. 
She recently completed a 110-hour content audit with more than 8000 page 
records. The happy client has the Content Map (printed in glossy color and 
mounted on foam core) hanging outside the V.P.’s office. You can reach 
Janice at janice@adaptivepath.com. 

Jasc WebDraw 
Version .5b 

Jasc Software 
www.jasc.com 

Free beta version. 


SVG Gets an Editor 

Scalable Vector Graphics (SVG) is a graphics 
format based on XML, and is currently nearing 
completion at W3C. Support for SVG isn’t as 
broad as for Flash and Shockwave yet, but you 
can expect that to change with broader industry 
and browser support. Preparing to grab hold 
of the emerging market, Jasc has released a 
beta version of WebDraw. 

Devil Doll Conversion 

To test WebDraw, I decided to try importing a 
vector file to convert to SVG. (I asked my pal 
Joe Sparks for a vector file of Devil Doll.) I 
thought this would be fun because SVG is 
becoming Adobe’s answer to Macromedia’s 
Flash. WebDraw doesn’t import anything but 
SVG at present, so first I had to import 
Macromedia FreeHand graphics into Adobe 
Illustrator and then export them to SVG. FreeHand 
and Illustrator don’t have very compatible 
gradient tools, so I removed the gradients 
before exporting. 

All of Devil Doll’s outlines disappeared when 
I ungrouped him, because Illustrator applied 
outlines to the group and not the individual 
elements. So, I reapplied the outlines and set 
to work re-creating the gradients. WebDraw 
has numerous pre-installed gradients; however, 
you can also import gradients from Paint Shop 
Pro. Devil Doll only has two simple gradients, 
so I chose to create them in the SVG Source. 

I applied a simple white-to-black gradient to 
the head and opened up the Source view. Then 
I edited the gradient definition by changing 
the name and RGB values of the colors, and 
applied that definition to the head. The final 
Devil Doll, shown in Figure 1, was a tiny 4KB, 
but it's worth noting that the same graphic as 
a Flash file was only 1KB. 
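
For readers who haven't looked at SVG source, a gradient definition is just a named element with color stops that a shape refers to by ID. The sketch below builds one with Python's standard library. It is not the markup WebDraw produced for Devil Doll: the gradient name, colors, and ellipse dimensions are invented for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)


def gradient_ellipse(gradient_id, start_rgb, end_rgb):
    """Return SVG markup for an ellipse filled with a two-stop linear gradient."""
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "120", "height": "120"})
    defs = ET.SubElement(svg, f"{{{SVG_NS}}}defs")
    gradient = ET.SubElement(defs, f"{{{SVG_NS}}}linearGradient", {"id": gradient_id})
    for offset, rgb in (("0%", start_rgb), ("100%", end_rgb)):
        ET.SubElement(
            gradient,
            f"{{{SVG_NS}}}stop",
            {"offset": offset, "stop-color": "rgb(%d,%d,%d)" % rgb},
        )
    # The shape points at the gradient by its ID.
    ET.SubElement(
        svg,
        f"{{{SVG_NS}}}ellipse",
        {"cx": "60", "cy": "60", "rx": "50", "ry": "40",
         "fill": f"url(#{gradient_id})", "stroke": "black"},
    )
    return ET.tostring(svg, encoding="unicode")


markup = gradient_ellipse("headGradient", (255, 0, 0), (102, 0, 0))
print(markup)
```

Changing the name and the RGB values, as I did in the Source view, amounts to editing the id attribute and the two stop-color values.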

Good Set of Tools 

I like the preset objects tool, which lets you 
place simple arrows, hearts, and other graphics 
into your document from a library. To edit the 
points or nodes of these objects you must 
select the object and choose Convert to Paths. 


There are four primary drawing tools. You 
can use the Line tool to draw straight lines. 
With the Polyline tool you can draw irregular 
polylines or polygons. The FreeHand tool lets 
you draw a path freely, without clicking for 
each point. And finally, the Path tool is similar 
to the default pen found in vector illustration 
tools such as Illustrator and FreeHand. The 
Path tool lets you mouse-down on a point, 
and then drag with the mouse further to 
adjust the curve of the connecting line. When 
you let go, you set the line. 

WebDraw also remembers your last-used 
settings and uses those the next time you 
create a file or use a tool. 

Hand-editing path nodes is common in 
vector-based illustration software, and Jasc 
has added some nice, unique touches to the 
path editing interface. When you select a 
node or point, the editable handles of the 
Spline are drawn with an arrow to indicate 
the path’s direction. When you select a path’s 
end point or start point, the cursor gets a 
little text indicator to let you know this. Nice 
feedback! 

I’d like to have seen both the Source and 
Canvas views at the same time, but the editor 
didn’t let me do this. All the same, switching 
back and forth was much less tedious than 
constantly previewing in a viewer or 
plug-in-enabled Web browser. 

Annoyances 

WebDraw is still in beta, and it shows in key 
areas. For example, I created some simple 
headline graphics in WebDraw, and each line 
had to be a separate element. Also, WebDraw 
doesn’t allow for type kerning yet. Hence, I 
had to convert the text to a path and manually 
move the letters to achieve nicely kerned 
pairs. In another instance, I tried the bevel 
effect on an ellipse. The result was nice, but 
WebDraw gives you little range with which to 
work. I tried the highest bevel setting (5) and 
got a very subtle bevel indeed. I couldn’t 
control the bevel angle, and I found that the 
only color or contrast control offered is a 
single color chip picker. 

Other problems include the fact that 
WebDraw doesn’t yet respect the whitespace in 
code you’ve already written and pasted into the 
Source view. And, the drawing tools are very 
basic in that simple effects can be applied to 
objects/drawings, but you don’t have much 
control over these effects. 



Cons 

Only opens SVG format files; no graphical gradient 
editor; limited bitmap output. 

A Flash Killer? 

WebDraw is an excellent start for a promising 
product. In addition to SVG, it also exports 
bitmapped graphics, but you’re limited to 
BMP and JPEG images, with no image quality 
or file-size adjustments. I realize that 
WebDraw isn’t intended for raster graphic 
creation, but with Web design and development 
staffs shrinking, the fewer tools you 
need to achieve the greatest variety of output 
formats, the better. Overall, I think Jasc 
WebDraw is an excellent low-cost tool for 
creating simple SVG code. It would be a great 
asset to anyone creating scripts to output SVG 
on the fly, or to build consistent, styled headlines 
on a Web site. It lacks Illustrator’s higher-end 
features, but you can’t beat the price 
(free, while in beta)! 

Figure 1: Devil Doll with gradient horns and head in WebDraw. 

—Lynne Cooney 


Lynne Cooney has been a Web developer and 
designer for over six years. She currently works on 
the sites for MacHome and Radiskull & Devil 
Doll. You can email her at lynne@ilynne.com. 

Animatek World Builder 3.0 

Digital Element 
www.digt-element.com 

$339 Standard; $939 Professional 


Control Your 3D Scenes 

For 3D graphics professionals, Animatek World 
Builder has long been the leader in 3D-terrain 
generation for Windows. It's a complete package, 
and includes a full collection of tools and 
an excellent rendering engine. The interface 
is very similar to most 3D packages, so it’s 
relatively easy to use, and with its large collection 
of plant libraries and new integration with 
3D Studio Max, World Builder has improved 
even more. 

Contents 

One of the problems with many 3D applications 
is a lack of adequate documentation or 
help files, a particular problem for newbies. 
World Builder excels in this area. The small 
booklet that ships with the product contains a 
quick introduction to the package and covers 
everything a new user needs to get started. The 
help files are also one of World Builder's 
strengths. They contain tutorials to guide 
beginner and experienced users, and come with 
explanations for all of the tools and commands 
that World Builder offers. 

User Interface 

As you can see in Figure 1, the interface offers a
customizable system of three view windows.
These windows are the traditional front, top,
and left windows with a fourth window called
the Library window. The Library window displays
a collection of object libraries that let you
drag and drop objects into a scene.

Three smaller windows are displayed on the
right-hand side of the screen, alongside the toolbars
and file menu. These windows provide an efficient way
to work in 3D. With this system, you're never
more than two clicks away from accessing the
tools or other features you need.

These three windows work together as a sort
of project manager, with which you can control
the parameters of the elements in your scenes.
The top window is called the Object Tree. It
lists all of the items in a scene, such as lights,
cameras, and landscapes. The middle window
changes depending on what you’ve selected in
the Object Tree. The bottom window displays
various interfaces depending on the selections
in the middle window.


Creating Scenes 

You can create a 3D terrain in World Builder by
drawing lines called Skeleton lines. These can
be drawn in any of the view ports, and while
they're being drawn you can perform several
actions on them, such as moving nodes.
Moving nodes lets you move a single point or
several points on the Skeleton line. You can
also add or delete points on the lines, close or
cut lines, and even add fractal noise to a
Skeleton line by introducing points. With
these tools you can create almost any imaginable
type of terrain; you can even import
real-world DEM files.

Once you've created the terrain, the next
step is to add other objects to your scene.
You can add items such as cameras, textures
for your scene, flowers, 3D clouds, trees,
lakes, rivers, or 3D objects from other
applications. The excellent detail in the
creation process gives you complete control
over everything you create; you can even
choose how you want blades of grass to
appear.

Animating objects in a scene is also very
simple. Using the animation tools at
the bottom of the window, you can set
keyframes at various animation points.
Once you have the basic positions (for
instance, the start and end of an arm swinging) and the
keyframes set, World Builder automatically
fills in the frames between the two. You can control
virtually anything in a scene to make
convincing elements.
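
As a rough illustration of what "filling in" frames means, the sketch below interpolates linearly between two keyframe values. This is not World Builder's actual method (the review doesn't describe it, and animation packages often use smoother curves); the function name and the arm-angle figures are invented for the example.

```python
def interpolate_keyframes(start, end, frame_count):
    """Return frame_count evenly spaced values from start to end, inclusive."""
    if frame_count < 2:
        raise ValueError("need at least two frames (the two keyframes)")
    # Size of the change applied between consecutive frames.
    step = (end - start) / (frame_count - 1)
    return [start + step * i for i in range(frame_count)]


# Hypothetical example: an arm swings from 0 to 90 degrees over 5 frames.
angles = interpolate_keyframes(0.0, 90.0, 5)
print(angles)  # [0.0, 22.5, 45.0, 67.5, 90.0]
```

The first and last values are the keyframes the animator sets; everything in between is generated.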

Rendering 

Animatek World Builder has always had a stellar
rendering engine, and this version continues
that trend. It offers various types of
rendering options such as bounding box, skeleton,
wire frame, OpenGL, Draft, Preview, and
Production Rendering. Depending on your
needs, the rendering times can be quick for a
preview, or quite long for a production rendering.
Note that the renderer, unlike some 3D
software, takes advantage of multiple processors,
which can be a tremendous boost for
large scenes.



Additional Considerations 

World Builder works seamlessly with 3D Studio
Max. The program offers a plug-in for this
purpose, called the World Builder Communication
Plug-In. The plug-in integrates 3D Studio
Max scene elements with World Builder
elements, making them appear as though they
were both rendered at the same time in a
single program, a process known as Full 3D
Blending. Amazingly, even the wide array of
3D Studio Max plug-ins, such as Hair, blend
well with the objects in World Builder.

If you prefer to use models from 3D Studio
Max, or if you use another modeler, you can
import models from outside applications
with the 3DS file importer. The built-in help
files are tremendous, and along with the
program disc, you receive a bonus CD-ROM
that contains a wide array of plant libraries.
The ease of use, combined with support for
3D Studio Max, further enhances Animatek
World Builder. I expect this package to
continue leading the way for 3D terrain building
applications.

-Clayton Crooks

Clayton Crooks is a freelance writer and independent
consultant based in Knoxville, TN. You
can email him at crooksl 3 ipianerc.com.
How UPS and the Pacific Stock Exchange Cooked Up Customized Systems




You've just finished putting together a package for an important new
client when your local United Parcel Service (UPS) driver arrives. She
frequently collects packages from your firm, so you spend a few
minutes chit-chatting while she punches information into a cool, curved
handheld device. This package is for your biggest client yet, you tell her,
and you're excited about the potential of future projects. Can you find
out when the package arrives?

She waves the little tablet-like device over the package like a magician,
hoists it onto a dolly and tells you to log on to www.ups.com.
Using the tracking number of the package, you'll be able to find out
exactly where it is at each stage of the delivery process. Once the client
signs for the package by running a stylus over the touch screen of the
UPS delivery driver's handheld tablet, the ticket to your peace of mind
will be available on the Web in about three seconds.

Network appliances are no longer just a consumer phenomenon consisting
of devices like Palm Pilots that synchronize information with a PC, or
microwaves that trade data with refrigerators. Several organizations are
tapping in to the power of network appliances via customized devices
that play a key part in their business processes.

Atlanta-based UPS and San Francisco-based Pacific Stock Exchange 
(PCX) are two of a handful of companies that have already adopted 
network appliances. While PCX only recently developed wireless and 
handheld devices for use on the trading floor, UPS has been deploying 
similar technologies for two decades. 



United Parcel Service 

UPS relies on its mobile device system to keep track of the millions of 
packages it ships each day. Called the Delivery Information Acquisition 
Device (DIAD), this complex handheld required more than a decade of
planning, hundreds of millions of dollars in corporate investments, and 
a lot of pitches to executive management to develop and deploy it. 

“The big picture is that we use this cool technology to populate a
database, which we can then make available to our customers through
the Web,” says Dave Salzman, project manager for DIAD. You may have
already seen your UPS driver using the device. It's an hourglass-shaped
tablet with a glow-in-the-dark keyboard, programmable "soft keys"
(large enough to be used comfortably by someone wearing gloves), and
icons intended to simplify global deployment.

About 85,000 UPS drivers use DIAD III on an average day, although
during the last holiday season that number swelled to about 127,000.
Delivery drivers scan a package's bar code, collect the receiver’s signature,
type the recipient's last name, and push a single key to simultaneously
complete the transaction and send the data. Because the devices
are equipped with an internal packet data radio, the DIAD sends delivery
information to UPS's data repository as soon as it's entered.

UPS says DIAD is the only handheld computer in the industry to both
collect and transmit real-time delivery information at virtually the same
time, giving customers a package's delivery status while the driver is
still at the delivery site. For customers, the DIAD is a key part of their
ability to receive more timely data on the whereabouts of their packages.
“We get about four million tracking requests per day through the
Web alone,” Salzman says. The system has ultimately enabled the package
shipping service provider to cut costs associated with delivery time,
employee training, and customer service.

Development 

DIAD development began in 1981 (around the same time the first PC hit
the market), when UPS established a research and development unit to
dream up ways to automate various processes. At that time, computers 
were a novelty, and UPS drivers had kept records using paper and clip¬ 
boards for more than So years. 

Executive management was skeptical about the change. If learning
new technology made the drivers even slightly less efficient, it would
greatly raise the costs of serving customers. After several years of
experimenting with process automation and mobile devices, the
development team came up with a design that was so well-received by
drivers and management that it seemed obvious in hindsight. As a
result, UPS used a clipboard metaphor for its device. "Part of that
design was to make it familiar, and not scary," Salzman says. The clipboard
design, which was rolled out in 1991 as DIAD I, let drivers
electronically capture delivery information, including signatures, before
touch panels ever existed.
Drivers had to learn how to use the handheld computer, but they
were spared training in other areas. "Every time we added a new service,
it was yet another thing drivers had to learn to do," Salzman says. "Now
all of that complexity is in DIAD, so the driver doesn't have to be an
expert in 'Early AM,' 'International Shipments,' and all of the other services
we offer."

The DIAD also allows for on-demand dispatching, so drivers don't 
have to strictly follow a route. They can now both keep to their routes 
and take on-demand pickups. Prior to DIAD, UPS had dedicated fleets 
for on-demand pickups. "Now we can use our regular fleet more
efficiently," Salzman says.

UPS anticipates benefits in the form of driver efficiencies, durability 
resulting in fewer field failures, and network savings from using packet 
data radio rather than cellular to transmit real-time data (although 
cellular remains the backup mode of transmission).

Each DIAD sends delivery information to a PC at one of several UPS
operating centers, which is networked to one of many mainframes at
UPS's Atlanta and Mahwah, NJ data centers. There are 13 IBM CMOS
mainframes and two IBM z-series mainframes at work, although not all
of them are used for delivery tracking.

UPS boasts the two largest IBM DB2 databases in the world. The
largest of these holds more than 11 terabytes of data. The company must
keep track of 13 million packages a day and save that information for 18
months. This information is available to anyone who wants to track a
package, whether via phone or Web.
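
To get a feel for that scale, here is a back-of-the-envelope calculation using the two figures quoted above. The average month length (365/12 days) is an assumption, and the result counts package records only, not the individual scans or signatures attached to each one.

```python
PACKAGES_PER_DAY = 13_000_000  # figure quoted in the article
RETENTION_MONTHS = 18          # figure quoted in the article


def records_retained(per_day, months):
    """Estimate package records on file, assuming an average month of 365/12 days."""
    # Integer arithmetic keeps the estimate exact: per_day * months * 365 / 12.
    return per_day * months * 365 // 12


total = records_retained(PACKAGES_PER_DAY, RETENTION_MONTHS)
print(f"{total:,} package records")  # 7,117,500,000 package records
```

In other words, the tracking data covers roughly seven billion packages at any given time.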

The Web technologies UPS uses for package tracking include CGIs
written in C++, and Java application servers. The system supports HTML
and XML delivery of tracking information using a variety of applications.


Using Available Technologies 



When the DIAD I went into production, UPS had to develop all of its
software: the operating system, the middleware, and the applications.
“Back in the early days we had a lot of people slaving away on the DIAD
because there was precious little software on the market that we could
use,” Salzman says. “We got bare metal and had to build all of the software
ourselves.”

The first version of the devices had 0.75MB of memory and could
scan bar codes, program routes, maintain timecard information, and
tally cash-on-delivery exchanges. Unlike later generations of the
DIAD, the device uploaded delivery information at the end of the day,
upon the driver’s return to the UPS branch. By the time UPS sent its
DIAD III hardware designs to Motorola, a variety of software was on
the market. The current version of the system, DIAD III, is based on
the 32-bit, 48MHz MPC823 PowerPC RISC processor. This has four
times more memory (6.5MB) than its predecessor, DIAD II. It runs on
PSOS, an embedded operating system, and uses Vermont Views software
for displaying data on a small screen.

“Even since 1999, more tools have become available for wireless
mobility,” Salzman says. “For each generation of DIAD we’ve had to
build less and less software because more and more is available off
the shelf.”

The third generation system, DIAD III, was developed at a cost of $100
million. It’s nothing to sneeze at, but consider that it cost UPS about
$350 million to develop and deploy the first DIAD in 1990. “To evolve a
product is a lot less expensive than inventing a product,” says Salzman.
Such a massive system requiring dedicated resources is a necessity for a
company the size of UPS, which has a technical staff 4000 strong and
spends about $1 billion a year on technology.


While UPS has pretty much perfected its DIAD system, improvements
continue. “We have a strategy of refreshing our technology every five
years,” Salzman says. He declines to discuss any new features of DIAD
IV, but says the device will be based on Microsoft's Windows CE operating
system.

Pacific Stock Exchange 

While DIAD is an example of large-scale deployment, there are
certainly different requirements for smaller environments and other
industries. The stock exchange industry, for example, has been experimenting
with wireless technology recently in an attempt to make
trading floors more efficient.

While UPS drivers are scattered around the world with their handheld
devices, 500 Pacific Stock Exchange traders share a 25,000-square-foot
space in downtown San Francisco. Trading information traditionally
comes from printed tickets and phone conversations, which don’t
necessarily lend themselves to accuracy. Human errors are readily introduced
to this system, such as those that occur when a broker is unable
to read some handwriting or understand information printed from a
machine that’s low on ink.

Nearly three years ago, PCX implemented a handheld system to help
investors, broker/dealers, and registered member firms more efficiently
buy and sell over 1800 stocks, bonds and other securities, and options
on more than 800 stocks.

“We're in the business of processing high volumes and high levels
of transactions,” says Wayne Hicks, vice president of development at
PCX. "We needed to get to an environment where we could handle
high volumes, but also eliminate some difficulty in getting trades
cleared." Due to the open outcry environment of PCX and the need for
traders to move freely about the floor, the ideal solution had to be
one that wasn't tethered in any one location. "That drove us to a
wireless technology,” Hicks says.

Infrastructure and software 

The PCX devices are comprised of two groups: the Floor Broker and
Market Maker handheld systems. About 100 people use Floor Broker
handhelds and another 400 use the Market Makers. Unlike UPS, which
formed a research and development unit devoted to in-house technological
development, PCX outsourced development of the Floor Broker
handhelds to Micro Design Services, a New Jersey-based wireless solutions
firm that devoted six people to the project. Mitsubishi manufactured
the hardware, and upgraded models are expected to hit the floor
during the first half of next year.

Hicks declines to reveal the cost of the system deployment and maintenance,
but does say that the equipment drives much of the cost for
implementation and maintenance. He also notes, "The difference
between now and three years ago is significant in terms of the cost of
devices. Then, devices were twice the cost they are today."

Weight was an important factor in the design, as traders must hold
the devices upright for long periods of time. Early devices on trading 
floors were large and heavy, with a short battery life. 

The PCX devices communicate with a centralized Hewlett-Packard 
(HP) server running on the HP-UX (Unix) operating system. Although the 
device's hardware hasn’t changed in three years, the software has gone 
through about five revisions. 

Micro Design develops and maintains the software based on PCX’s
requirements and specifications, and on suggestions from the traders
themselves. “They may want a sound played louder or the screen
displayed in a different way,” Hicks says, “or additional functionality for
complex orders.”

While PCX, with help from Micro Design, pretty much controls the 
design and maintenance of the Floor Broker system, the devices on 
the Market Maker system are developed and maintained by the firms 
trading on them. 

A third organization, the San Francisco-based Cutler Group, has about
40 devices at work between PCX and the Chicago Board Options Exchange.
The firm has developed its own software, called Denali, to run on
Pentium III-based, touch screen devices. "We try to use the technology
to take strategies and thoughts regarding the way the market works and
somehow codify that information into the application base," says Duncan
Wilcox, chief information officer at Cutler Group. Rather than have
traders put together six pieces of data to make three decisions, the
devices act as the total window to what needs to be done, based on Cutler
Group's own proprietary evaluation model.

Denali started out as a fat client, and has seen its feature set grow to
outstrip the memory size of the devices on which it runs. Over the years
it has been reworked into a thin client, a request and display tool that
communicates with a series of different servers sitting on separate
machines on the network.

Denali works on seven different servers:

1. The risk server handles analysis tools.

2. The position server shows what’s in a portfolio at any given time.

3. The order server is used for buying shares.

4. The destination server translates data to meet industry standard
requirements in various trading environments.

5. The quote-feed server manages all real-time information.

6. The rules server defines automated tasks.

7. The report server prints reports.

The Intel-based servers each have a back-up, and are based on a
mixture of operating systems from Windows NT to Linux.
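
The thin-client arrangement can be pictured as a simple routing table: the handheld decides what kind of request it has and hands it to the matching server. The sketch below is only an illustration of that idea. The seven server roles come from the list above, but the request-type names and the function are invented here, and nothing is known about how Denali actually routes its traffic.

```python
# Hypothetical request types mapped to the seven server roles listed above.
SERVER_FOR_REQUEST = {
    "risk": "risk server",
    "position": "position server",
    "order": "order server",
    "destination": "destination server",
    "quote": "quote-feed server",
    "rule": "rules server",
    "report": "report server",
}


def route(request_type):
    """Return the name of the server that should handle a request type."""
    try:
        return SERVER_FOR_REQUEST[request_type]
    except KeyError:
        raise ValueError(f"no server handles {request_type!r}") from None


print(route("quote"))  # quote-feed server
```

Because the client only looks up a destination and displays the reply, new features can be added on the servers without growing the software on the handheld.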

Using Denali-based devices, Cutler Group's 40 traders get unique
views of how markets are changing, with cool graphics like briefcases
whirling around in blenders. More importantly, order routing and execution
capabilities are digitized. “Seven or eight years ago the predominant
form of trading was floor trading, where people yell and scream in
a pit,” Wilcox says. “Today, guys are standing there staring at a screen
clicking away, and it can get executed from San Francisco onto the electronic
exchange in New York.”

Cutler Group continually updates its device technology as the stock
markets change. "They’re bringing out single-stock futures, which they
never had before," Wilcox says, “and we're looking at better ways to
graphically represent larger amounts of information.”

Opportunities 

When companies like UPS and Cutler Group take on the task of developing
their own business technologies, surely they must consider whether
the devices can be marketed elsewhere. Cutler Group's Wilcox says the
thought has certainly crossed executives' minds. “It’s something that’s
attractive at some level,” he says, “but we developed [our device]
specifically for the way Cutler looks at the world.” The challenge in
marketing your in-house technology is in meeting the lowest common
denominator: how to make the technology work for any company. “We
would have to be more interested in working with companies somewhat
similar to ours," he says.


Nevertheless, despite some industry similarities, all companies tend
to operate in their own unique way. What works for UPS, the PCX, and
the Cutler Group may not be the right solution for firms in completely
different industries. If you're serious about deploying a wireless handheld
system to help run your business, start by checking with various
hardware vendors, such as Motorola and Mitsubishi, to find out how
they work with companies to design customized mobile device systems.
Or contact a wireless consulting firm that specializes in evaluating and
developing such systems.

Make sure that no matter what company you contract for design
and manufacture, you obtain insight from your employees on how
the new technology can both fit in with and improve current business
processes. It took the UPS technical team quite a few years of
working with delivery drivers to come up with the clipboard design
for the original DIAD. With new technologies available, it may not
take that long to develop a system today, but don't expect miracles
overnight. Lots of research, good insight from industry experts,
vendors, and employees, and wise investing, are all requirements
when creating your company’s own recipe for the ultimate network
appliance.

Amber Howie is a freelance business and technology writer based in San
Francisco. She has contributed to various technology publications including
CRN, Intelligent Enterprise, and relearn neet. You can email her at
0 m be rs to r ©ea rth/in fc. net
Building Devices that Communicate


Amit Asaravala

In a May 2001 Scientific American article, Tim Berners-Lee, James
Hendler, and Ora Lassila outlined a vision for the Web's future. Dubbed
the "Semantic Web," their vision focused on software applications that
could share data with one another. For instance, the calendar program
on a personal handheld device could check with the master calendar
application at a doctor's office and automatically schedule a patient's
next appointment. Taking this vision one step further, it's not difficult
to imagine that stereos may one day automatically check the servers of
major recording companies and download the latest hit singles. Likewise,
a car's in-dash computer could send information about stress,
heat, and mileage levels back to the automobile manufacturer where
that data could be processed to determine the next tune-up date.

These ideas certainly aren't new. Berners-Lee had in fact written about
the semantic Web as early as 199 a. And for decades, we've dreamed of the
day when computers would handle most mundane tasks, freeing our time
for other activities. While there are great human benefits to the realization
of these ideas, advancements in this area have been slow and often owned
by a small number of private organizations. The problem with such proprietary
research is that the results mostly remain in the hands of the investing
companies. Interoperability across different products and devices
becomes limited by competition and the overhead of forming complicated
partnerships. If we were to continue at this pace and with this methodology,
chances are slim that the semantic Web would ever be realized.


For companies to create devices and software that interact with the
least friction, the entire industry needs to agree on one set of rules for
communication. The importance of having a standard protocol in this
space relates to a principle that economists call a network effect. The
value of products (and technologies) that are subject to network effects
increases as more people adopt them. As more and more companies
create applications that use the same protocol, the protocol’s value
grows. Right now, with the many scattered and proprietary technologies
that organizations rely on for data transfer, not one has a great worth that
compels other companies to adopt it.
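
One common way to picture a network effect (an illustration added here, not a formula from the article) is to count how many distinct pairs of adopters could talk to each other once they all share a protocol. The function name below is made up for the example.

```python
def possible_links(adopters):
    """Number of distinct pairs among the adopters of a shared protocol."""
    # Each of n adopters can pair with n - 1 others; halve to avoid double counting.
    return adopters * (adopters - 1) // 2


for n in (2, 10, 100):
    print(n, "adopters ->", possible_links(n), "possible pairings")
# 2 adopters -> 1 possible pairings
# 10 adopters -> 45 possible pairings
# 100 adopters -> 4950 possible pairings
```

The number of adopters grew fiftyfold, but the number of possible pairings grew nearly five-thousandfold, which is why a single widely adopted protocol is worth more than many isolated ones.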

Fortunately, several companies have begun to understand that they
must work together to create an open standard. Last year, representatives
from Microsoft, DevelopMentor, and UserLand Software released
the specification for the Simple Object Access Protocol (SOAP). Already
in its first revision at the W3C, the SOAP technical report explains that
the protocol facilitates the “exchange of information in a decentralized,
distributed environment.” What this really means is that applications
and devices that implement SOAP will be able to send information to
(and receive information from) other compliant applications and
devices. This opens the way for the creation of smart appliances that
can "talk" to one another to schedule appointments, reorder supplies,
and generally automate otherwise manual tasks.

Inside SOAP 

Like most protocols, SOAP isn't a tangible product,
but rather a set of rules that anyone can
implement in a software client or server. In
essence, SOAP lets your applications invoke
methods on servers, services, components, and
objects that lie at remote locations on the Internet.
While other protocols like DCOM and
IIOP/CORBA let you do similar things, they're
limited in that they weren't designed specifically
for the Internet and for communication
between diverse companies and devices. Using
DCOM to communicate between applications in
two separate companies is a difficult task that first requires agreeing on
ports, transfer protocols, and so on. SOAP, on the other hand, sits on top
of existing HTTP connections. As most companies have Web servers
configured for HTTP connections on standard port 80, most of the initial
coordination is complete.

Listing 1

POST /Kitchen HTTP/1.1
Host: www.kitchenserver.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "Some-URI"

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetApplianceTemperature xmlns:m="Some-URI">
      <appliance>oven</appliance>
    </m:GetApplianceTemperature>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>


Of course, companies will still need to share APIs for available objects
and methods, but SOAP lets people focus on these APIs and the data
that needs to be transferred, rather than on the trouble of getting two
disparate systems to communicate. All you need is a SOAP-compliant
client application on the one side and a SOAP-compliant server on
the other. The server could be as simple as a Web server that checks the
headers of incoming HTTP requests. If it finds a POST statement with a
text/xml-SOAP content-type or SOAPAction header, it sends the statement
to a SOAP engine that parses the command found within.
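
As a minimal sketch of that header check, the function below decides whether an incoming request should be handed to a SOAP engine. The function name and the dictionary-of-headers representation are assumptions made for illustration; a real Web server would expose request headers through its own API.

```python
def looks_like_soap(method, headers):
    """Return True if an HTTP request should be passed to the SOAP engine."""
    if method.upper() != "POST":
        return False
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Strip parameters such as charset from the Content-Type value.
    media_type = normalized.get("content-type", "").split(";")[0].strip().lower()
    return media_type == "text/xml" or "soapaction" in normalized


request_headers = {
    "Content-Type": 'text/xml; charset="utf-8"',
    "SOAPAction": '"Some-URI"',
}
print(looks_like_soap("POST", request_headers))  # True
print(looks_like_soap("GET", request_headers))   # False
```

Anything that fails the check can be served as an ordinary Web request, so the same server can handle both kinds of traffic on one port.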

There are numerous SOAP implementations available, including
SOAP::Lite for Perl, which was discussed in the May 2001 issue of Web
Techniques (see “Applying .Net to Web Services” by Brian Jepson in the
“Online Resources" box). Apache SOAP is a Java implementation for
the Apache Web server, based on code from IBM. Microsoft has also
released the SOAP Toolkit, and has incorporated SOAP into products that
are related to its .Net framework. Likewise, Sun has agreed to adhere to
the SOAP specification in future products. Support from major vendors
means that most applications on the market will be able to compose
and understand SOAP messages.

Typically, SOAP messages are XML documents embedded in HTTP
requests. In a Web server model, a request is POSTed to the server,
which then sends a response back to the client. Listings 1 and 2 show a
variation on the request and response messages provided in the original
technical report. In the report, the request message checks a stock
quote server for the latest price on symbol DIS. The response, once
parsed out of the XML document, is floating point value 34.5. Here, I've
modified the text of the SOAP request (Listing 1) so that it queries a
fictitious kitchen server for the current oven temperature. The SOAP
engine accesses the GetApplianceTemperature method, passes it
the oven parameter, receives the resulting value, and then returns the
response message (Listing 2).


Listing 2

HTTP/1.1 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetApplianceTemperatureResponse xmlns:m="Some-URI">
      <temperature>350</temperature>
    </m:GetApplianceTemperatureResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
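
For readers who want to experiment, here is a small sketch that builds the body of the request in Listing 1 and pulls the temperature out of a response shaped like Listing 2, using only Python's standard XML library. The function names are invented for this example, "Some-URI" is the same placeholder namespace used in the listings, and nothing is actually sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
METHOD_NS = "Some-URI"  # placeholder namespace, as in the listings

# Ask ElementTree to use the familiar SOAP-ENV prefix when serializing.
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def build_request(appliance):
    """Return the XML for a GetApplianceTemperature call as a string."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{METHOD_NS}}}GetApplianceTemperature")
    ET.SubElement(call, "appliance").text = appliance
    return ET.tostring(envelope, encoding="unicode")


def parse_temperature(response_xml):
    """Extract the temperature value from a response envelope."""
    root = ET.fromstring(response_xml)
    node = root.find(".//temperature")
    if node is None or node.text is None:
        raise ValueError("no temperature element in SOAP response")
    return float(node.text)


print(build_request("oven"))
```

A real client would POST the string returned by build_request along with the HTTP headers shown in Listing 1, then pass the body of the reply to parse_temperature.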


Note that SOAP messages have three parts: an envelope, an optional
header, and a body. In this sense, the structure is much like an HTML
document. The envelope is the root element, and the header and body
are children. I haven't used a header in Listings 1 or 2, but in some applications
a header is beneficial because it can contain valuable information
about the message itself. If your SOAP
engine were set up to handle incoming
requests based on the status of each message,
your header might indicate whether each
message had a high, normal, or low priority.
This would be especially useful if your kitchen
server needed to turn your oven down after
realizing that the temperature was too high.
Likewise, the software at an auto manufacturer could issue a command
to stop a paint hose if the nozzle became clogged.
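
The sketch below shows one way an engine might act on such a header. SOAP itself doesn't define a priority entry, so the Priority element, its "Some-URI" namespace, and the function names here are all hypothetical, chosen only to illustrate the idea of reading the header before the body is processed.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
PRIORITY_TAG = "{Some-URI}Priority"  # hypothetical header entry
RANK = {"high": 0, "normal": 1, "low": 2}


def message_priority(envelope_xml):
    """Read the priority from the SOAP header, defaulting to normal."""
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return "normal"  # the header is optional
    entry = header.find(PRIORITY_TAG)
    if entry is None or entry.text is None:
        return "normal"
    value = entry.text.strip().lower()
    return value if value in RANK else "normal"


def order_by_priority(messages):
    """Return messages sorted so that high-priority ones come first."""
    return sorted(messages, key=lambda xml: RANK[message_priority(xml)])
```

With this in place, a command to turn the oven down could be marked high and would be handled ahead of routine temperature queries that arrived earlier.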

The body of the message contains the name of the method that’s
being accessed and any parameters being passed in. The examples so far
have been fairly simple, requiring only one parameter, but keep in mind
that SOAP is very much like DCOM in that it can serialize entire objects
and pass them to another application where they're deconstructed and
acted upon.

Although most examples rely on HTTP as the transport protocol, the
SOAP server doesn't necessarily have to be a Web server. Because SOAP
messages are XML-based, they can be sent over other transport protocols,
like SMTP, if necessary. A properly configured mail server or mail
processor could check the content-type header of incoming messages
and execute commands accordingly. This makes SOAP powerful for use
with small devices that execute commands remotely, but don't need to
receive an immediate response. For instance, you could send an email
message from your RIM Blackberry pager, commanding your home
server to add a new name and phone number to your contacts list.
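
To make the mail scenario concrete, the following sketch wraps a SOAP envelope in an email message and shows the content-type check a mail processor might perform. The addresses, the AddContact method, and the function names are invented for the example, and the message is only constructed, not sent.

```python
from email.message import EmailMessage

# Hypothetical request asking a home server to store a new contact.
ENVELOPE = (
    '<SOAP-ENV:Envelope '
    'xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    '<SOAP-ENV:Body>'
    '<m:AddContact xmlns:m="Some-URI">'
    '<name>Pat Example</name><phone>555-0100</phone>'
    '</m:AddContact>'
    '</SOAP-ENV:Body>'
    '</SOAP-ENV:Envelope>'
)


def soap_email(sender, recipient, envelope_xml):
    """Build an email whose body is a SOAP envelope sent as text/xml."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "SOAP request"
    msg.set_content(envelope_xml, subtype="xml")  # Content-Type: text/xml
    return msg


def is_soap_mail(msg):
    """The check a mail processor could run on each incoming message."""
    return msg.get_content_type() == "text/xml"


message = soap_email("pager@example.com", "home-server@example.com", ENVELOPE)
print(is_soap_mail(message))  # True
```

A mail processor that finds a matching message would pass its body to the same SOAP engine used for HTTP requests; the only difference is that no reply needs to go back right away.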

Embedded Applications 

Because SOAP is a specification, and not an actual software package,
developers are free to implement it however they need to, provided that
they follow the rules. This has led to the creation of several third-party
packages that can be used to SOAP-enable Web servers (as mentioned
above) and client applications. Packages hide the implementation
details so that users can focus on building their applications, rather
than on adhering to every SOAP detail.

Another plus about packages is that each one can have different
strengths and weaknesses, like speed, language, and portability. This is
especially important for network appliances, which often have little
memory and slow processor speeds so that the overall hardware can be
small and inexpensive. There's just no room for complete Web servers
and overstuffed applications on these devices, so special implementations
need to be built with respect to overall size.

Embedding.net has been developing an embeddable SOAP (eSOAP)
package, a lightweight implementation specifically for network appliances.
The classes come in C++ or Java versions and are split into three
groups: the core set, the classes for client applications, and classes for
server applications. The HTTP portion of eSOAP is handled by the Abyss
Web server, which listens for any incoming requests and sends SOAP
messages to the eSOAP application for handling. You could also use
Apache 1.3 with the mod_esoap module.

The entire eSOAP engine has a memory footprint of only 150KB. For a
relatively minimal hardware investment, television and stereo
manufacturers could begin incorporating eSOAP servers and exposing the
corresponding methods for turning the device on and recording various
channels or stations. Or, the devices themselves could act as clients and
transmit data to central servers. This latter model has the advantage of
saving even more resources on the device by pushing most of the
CPU-intensive operations to the server.

Realizing the Vision 

Most of the discussion about SOAP and the semantic Web centers on
software and protocols, but rarely on hardware. This is because the
hardware exists and has existed for a while now. Most technically
savvy companies already have Web servers and intranets, and
manufacturers are adding data connectors and TCP/IP stacks to devices
like handhelds, digital video cameras, and game consoles. And telephone
lines and Ethernet jacks are no longer the only options for moving
data from one place to another: infrared, 802.11b, and Bluetooth are
among the possibilities. Certainly great skill and effort go into
hardware design and development, but it wouldn't be a leap to
embed Internet connections and lightweight operating systems on
many consumer appliances.

Because it's infrastructure agnostic, SOAP is positioned to become
the de facto standard in communications. With its reliance on
common protocols and languages such as HTTP and XML, SOAP
promises to reduce the amount of coordination and development
traditionally necessary to facilitate communication between two or
more devices.

In addition to its uses for network appliances, SOAP is being
touted as the enabling component for Web services. This is one of the
major reasons behind Microsoft's involvement with the SOAP
specification. The company's .Net framework will use SOAP messages to
send information between companies that have agreed to share data.
eBay has already agreed to use the .Net framework to open up its
auction databases. When the technology is in place, developers from
other sites will be able to write auction applications that rely on live
data from eBay's central database.

In essence, SOAP is enabling the Web that we don't see. It's the
technology that will help us realize a semantic, invisible Web that runs
in the background, doing our bidding without our constant attention. So
long, Web browsers.

Amit is Editor in Chief of Web Techniques magazine.
MULTILINGUAL METHODS

HTML filtering got you down? No matter what your preferred
development language, an ad hoc parser is an easy way to tame your input.

Al Williams

Parse for


In last month's "Java@Work" column, I showed you how to use JavaCC
to build parsers for text processing. Although JavaCC is very
powerful, sometimes it's more than you need.

Suppose you wanted to strip fancy formatting from arbitrary Web pages
to make them more usable from a PDA, or another network appliance
with limited display capabilities. You really wouldn't care much about
the exact document structure in that case; you'd only need to pick out
a few key tags, ignore comments, and extract the text.

Although you could write a full-blown grammar for JavaCC or YACC,
that approach seems like overkill in this instance. A better solution
would be to write a simple ad hoc parser, one that can read the HTML
and process it the same way you might by hand.

The downside of this technique is that it can be difficult
to get precise results and make complex changes. On the
plus side, it's simpler to understand, and it's generally
applicable to any language. JavaCC, by comparison, is very
Java-specific.

Implementing A Parser 

For flexibility, I decided to tackle this problem in Java with
a parser that accepts an InputStream. That way, I could
parse a file, a Web site, or anything that can be converted
into an InputStream. (Just because I wanted an easier
way to do things doesn't mean that I didn't want to reuse
my finished code.) I wrote a general purpose class that
contains all of the parsing logic, which you can see in
Listing 1.

Now when you want to pick apart a Web page, you need
only extend this general purpose class. If you pass the base
class a URL, it repeatedly calls doElement for each tag or
text item in the Web page. Of course, that means you'll
override doElement to perform the processing you require.

Each call to doElement receives either a tag or a non-tag
item (you can tell the difference because the tags start
with an angle bracket). Once the parser runs out of input,
doElement receives a null. This is useful if you'd like to write
out any closing information.

In Java or any other language, the ad hoc rules for parsing
HTML are very simple. The program examines the first input
character to determine whether it's an angle bracket. If it is,
the next input token will be an HTML tag. If it isn't, then it
must be some text. The ender variable holds the character
that signifies the end of this token. In the case of tags,
ender is a closing angle bracket. For anything else, the end
character is the open bracket (which would signify the
beginning of a new tag).
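
The rule described above can be sketched in a few lines. This is a simplified illustration only: it works on a String rather than an InputStream, it does not handle comments, and the names AdHocTokenSketch and tokens are assumptions made for the example rather than anything from the listing.

```java
import java.util.ArrayList;
import java.util.List;

class AdHocTokenSketch {
    // Splits HTML into tags ("<...>") and text runs. The first character
    // of each token decides which character ends it.
    static List<String> tokens(String html) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < html.length()) {
            boolean inTag = html.charAt(i) == '<';
            char ender = inTag ? '>' : '<';
            int end = html.indexOf(ender, i + 1);
            if (end < 0) {
                end = html.length();   // ran out of input before the ender
            } else if (inTag) {
                end++;                 // the '>' belongs to the tag itself
            }
            out.add(html.substring(i, end));
            i = end;
        }
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokens("<P>Hello <B>world</B></P>")) {
            System.out.println(t);
        }
    }
}
```

Note the asymmetry: a tag includes its ender, while a text run stops just before it, because that '<' is really the start of the next token.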

One problem with writing this sort of parser is that you
usually examine one character too many, because the stop
character is actually the start of the next token. Java has a
PushbackReader class just for this purpose, but I decided
to keep my own score of the last character, in the
imaginatively named lastchar variable.
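
For comparison, here is a minimal sketch of the PushbackReader alternative mentioned above. It is not the approach the listing takes; the class and method names (PushbackSketch, readText, nextChar) are hypothetical, and checked I/O exceptions are wrapped only to keep the example short.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class PushbackSketch {
    // Reads one text run. On reaching '<' the character is pushed back,
    // so the next read sees it as the start of the following token.
    static String readText(PushbackReader in) {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (c == '<') {
                    in.unread(c);   // one character too many: return it
                    break;
                }
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Reads a single character, or -1 at end of input.
    static int nextChar(PushbackReader in) {
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PushbackReader in = new PushbackReader(new StringReader("Hello <B>world"));
        System.out.println(readText(in));          // the text run
        System.out.println((char) nextChar(in));   // the '<' that was pushed back
    }
}
```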

Ordinary text and tags are easy to read, but comments
take more work to identify. HTML supports two forms of
comments. Old-style comments, which look like this:

<! I am an old comment >

And newer comments, which start with <!-- and end with
-->, like this:

<!-- I am a newer comment -->

The old-style comments are just like any other tag, but
the new-style comments require special handling. The
parser must read ahead enough to know what kind of
comment it's dealing with, and then set the value
of multicomment to either true or false, accordingly.

Ad Hoc Details 

With the preliminary reading done, the parser simply loops 
until it finds an ending condition. Because of the new-style 
comments, this while loop is slightly more complicated 
than you might guess: 

while ((c!=ender && !multicomment) ||
       (multicomment && c==ender &&
        dashct!=2) ||
       (multicomment && c!=ender)) {

In plain English, the loop continues as long as any of
three conditions are met. If multicomment is false and
the end character hasn't been read, it continues. If
multicomment is true, it continues only when the end character
either hasn't been read, or has been read, but the last two
characters weren't dashes.

Obviously, to manage this the code needs to count
consecutive dashes. It must also keep track of the
endoffset variable, which indicates whether the last
character read is part of the current token (as in the case of a
tag) or the start of the next token. In addition, you'll notice
that lastchar begins with a value of -2, indicating that
there was no last character. I would have used -1, but
because that indicates an end of file, I wanted a different
value.
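
The dash-counting idea can be shown in isolation. The sketch below is an assumption-laden simplification, not the listing's code: it scans a String, the names CommentEndSketch and endOf are made up for the example, and it accepts two or more dashes before the closing bracket, which is slightly more lenient than a test for exactly two.

```java
class CommentEndSketch {
    // Returns the index just past the tag or comment that starts at
    // 'start' (which must point at '<'), or -1 if the input ends first.
    static int endOf(String html, int start) {
        boolean multicomment = html.startsWith("<!--", start);
        int dashct = 0;   // consecutive dashes seen just before this character
        for (int i = start + (multicomment ? 4 : 1); i < html.length(); i++) {
            char c = html.charAt(i);
            // An ordinary tag ends at the first '>'. A new-style comment
            // ends at a '>' only when at least two dashes precede it.
            if (c == '>' && (!multicomment || dashct >= 2)) {
                return i + 1;
            }
            dashct = (c == '-') ? dashct + 1 : 0;
        }
        return -1;
    }

    public static void main(String[] args) {
        String html = "<!-- a > b -->rest";
        int end = endOf(html, 0);
        System.out.println(html.substring(0, end));
    }
}
```

The '>' in the middle of the comment does not end the token, because the character before it is not a dash.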

The code uses a StringBuffer to build the token. This is
more efficient than a String, as the code can directly modify
the object instead of creating multiple String objects.
Prevent and detect errors in your
dynamic site automatically


Storing State on
the Server

By Adam Kolawa


Web applications do not carry state from
page to page; each instance of each page is
displayed independently of what happened in
the past. In other words, there is no way to tell
how one page carries to the next. As a result,
one critical decision you need to make as you
develop your Web application is where you
should store state information.

If you need to store critical data such as
customer profiles or purchase information, your
best bet is to store it in files or databases.
Storing information in files and databases is
slow because you need to access the file or
database every time you need to record, update,
or access state information; however, if you are
working with critical data, it is probably worth
sacrificing speed to gain security.

If you are less concerned about the state
information you are storing, you might want to
store it using dynamic/short-term memory
(such as servlet/FastCGI "global" variables) or a
server cache. Both of these methods are faster
than storing state in files or databases, but they
are less reliable. When you store state in
dynamic/short-term memory, the state is held
as long as the servlet is alive. If the program
whose global variables store the state
terminates unexpectedly, the state information
is lost and there is no way to retrieve it. When
you store state using the server cache, the main
problem you encounter is that cache behavior
varies from server to server, so your application
might behave unpredictably.

If you store state on the server using
dynamic/short-term memory, you need to
carefully think about who owns the state and
how long it will persist. If you store a large
amount of information for each logged in user
and never expire the state, you will certainly run
out of memory eventually. If you do expire the
information, you need a clear recovery path if
the user does access the application after you
have timed out their data. You also need to
account for the possibility of missing or
corrupted data not letting you restore state, and
the server being reset between requests, etc.
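
As a rough illustration of the expiry concern, the following sketch keeps per-user state in memory with a time limit. Everything here is hypothetical (the class name ExpiringStateSketch, the string keys, and the explicit `now` parameter, which is passed in only to make the behavior easy to follow); a real application would also need the recovery path described above for callers that receive null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ExpiringStateSketch {
    private static final class Entry {
        final Object value;
        final long expiresAt;
        Entry(Object value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> store = new ConcurrentHashMap<>();
    private final long ttlMillis;

    ExpiringStateSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    void put(String user, Object value, long now) {
        store.put(user, new Entry(value, now + ttlMillis));
    }

    // Returns null when the state is missing or has timed out. The caller
    // must then take a recovery path, such as asking the user to log in.
    Object get(String user, long now) {
        Entry e = store.get(user);
        if (e == null) return null;
        if (now >= e.expiresAt) {
            store.remove(user);
            return null;
        }
        return e.value;
    }

    // Drops all expired entries so memory use stays bounded.
    int sweep(long now) {
        int before = store.size();
        store.values().removeIf(e -> now >= e.expiresAt);
        return before - store.size();
    }

    public static void main(String[] args) {
        ExpiringStateSketch state = new ExpiringStateSketch(30 * 60 * 1000L);
        long now = System.currentTimeMillis();
        state.put("amy", "cart-1", now);
        System.out.println(state.get("amy", now));
    }
}
```

Because the map lives in the server process, this state is still lost if the process terminates, which is the reliability trade-off the column describes.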


Adam Kolawa, Ph.D., is Chairman and CEO of
ParaSoft. You can reach him at ak@parasoft.com.


You really only need to implement the parse
method to use this code. You could repeatedly
call it until it returns null. However, I wanted
to wrap the entire logic in the base class to
avoid rewriting the same code repeatedly.

The processURL method requires a
String containing the URL you want to parse.
It uses the URL object's openStream method
to retrieve an InputStream that corresponds
to the URL's document. Then it calls parse and
passes each token (including the final null) to
the doElement routine. By default, this
method only prints the token out, but you can
derive a new class to do anything you like. The
class also has an example main routine, so
you can test it from the command line:


java AHParse http://www.webtechniques.com


Using AHParse 

My goal was to strip down complex Web pages
into a format more suitable for appliance-style
devices. With the HTML parser working, that's a
relatively easy job. I can process the page like
this:


Step 1. Emit a standard Web page header.
Step 2. Strip JavaScript links.
Step 3. Convert <TABLE> and <TR> tags to <BR> tags.
Step 4. Convert <TD> tags to spaces.
Step 5. Translate <IMG> tags to special hyperlinks.
Step 6. Ignore everything in <SCRIPT> tags.
Step 7. Pass only <B>, <PRE>, <P>, <BR>, and <A> tags (and the
        corresponding close tags).
Step 8. Block any image named on a list (for example, 1-pixel GIFs
        used for formatting).
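
To make two of these steps concrete, here is a small sketch of the table conversions (steps 3 and 4). It is an illustration under assumptions, not the code from Listing 2: the names StructuralTagSketch, convert, and isTag are invented, and closing tags such as </TABLE> are simply left for a later filtering step to drop.

```java
import java.util.Locale;

class StructuralTagSketch {
    // 'token' is one tag or one text run, as delivered to doElement.
    static String convert(String token) {
        if (!token.startsWith("<")) return token;   // text passes through
        String upper = token.toUpperCase(Locale.ROOT);
        if (isTag(upper, "TABLE") || isTag(upper, "TR")) return "<BR>";
        if (isTag(upper, "TD")) return " ";
        return token;
    }

    // True if the upper-cased tag opens an element with exactly this name,
    // so that "<TR>" matches "TR" but "<TRACK>" does not.
    static boolean isTag(String upper, String name) {
        if (!upper.startsWith("<" + name)) return false;
        int next = 1 + name.length();
        return next >= upper.length()
                || !Character.isLetterOrDigit(upper.charAt(next));
    }

    public static void main(String[] args) {
        String[] sample = { "<table border=1>", "<tr>", "<td>", "cell", "<td>", "cell" };
        StringBuilder out = new StringBuilder();
        for (String t : sample) out.append(convert(t));
        System.out.println(out);
    }
}
```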


The completed code appears in Listing 2 
(available online). It's relatively straightforward, 
with only a few twists. One problem is that 
HTML isn't case sensitive, but some items 
inside the HTML (like URLs) are case sensitive. 
The doElement method makes an upper case 
copy of the token and uses it when matching 
text. But when it extracts a portion of the text, 
it uses the original string. 

If the program lets a particular tag pass
through its filters, it should also pass the
corresponding closing tag. That's why, before
checking for a pass-through tag, the program
executes the following line:


if (tag.charAt(0)=='/')
    tag=tag.substring(1);



This way, the rest of the code can check for
the base tag only. In other words, after this line
executes, both <A> and </A> tags will match a
test for A.
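
Put together with the upper-case copy described earlier, the pass-through test might look something like the sketch below. The names PassThroughSketch, baseName, and passes are assumptions for illustration; only the list of allowed tags comes from the steps given in the article.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class PassThroughSketch {
    static final Set<String> PASS =
            new HashSet<>(Arrays.asList("B", "PRE", "P", "BR", "A"));

    // Returns the bare upper-case element name, so that "<a href=x>"
    // and "</A>" both yield "A".
    static String baseName(String token) {
        String tag = token.substring(1).toUpperCase(Locale.ROOT);   // drop '<'
        if (tag.length() > 0 && tag.charAt(0) == '/') {
            tag = tag.substring(1);                                 // drop '/'
        }
        int end = 0;
        while (end < tag.length() && Character.isLetterOrDigit(tag.charAt(end))) {
            end++;
        }
        return tag.substring(0, end);
    }

    static boolean passes(String token) {
        return token.startsWith("<") && PASS.contains(baseName(token));
    }

    public static void main(String[] args) {
        for (String t : new String[] { "<a href=x.htm>", "</A>", "<FONT size=2>" }) {
            System.out.println(t + " -> " + passes(t));
        }
    }
}
```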


Handling Images

I didn't want images cluttering up the page,
but I still wanted users to be able to view an
image if they so chose. The solution was to
convert <IMG> tags to hyperlinks. The program
makes the hyperlink text equal to the ALT
attribute of the image. If there is no ALT text,
the base name of the image file appears.
Figure 1 shows a portion of a processed
Web page on which you can see the image
links.

I noticed very early that many sites use
spacer graphics that are everywhere on the
page, but have no meaning in this text-only
view. Therefore, I added an ignore list that
searches for the base name of the image. If the
image name is on the list, the program drops
that image from the final output. The program
reads the list from any additional command
line arguments. So to view the magazine's Web
site, you might enter:


java WebParse http://www.webtechniques.com pixel.gif


Another problem with linking to an image
occurs when the image appears in a hyperlink
itself. I modified the program to detect when
it was parsing through a hyperlink (the
inanchor flag). When this flag is true, the
program doesn't emit a hyperlink for the
image, just the image's text. This prevents
nested hyperlinks, which can be confusing.


Figure 1. The Web Techniques home page stripped to its
minimum form.


Attribute Parsing

Although AHParse does most of the work,
one parsing job belongs to the derived class.
To properly transform the hyperlinks and
images, the program needs to extract attribute
values, like HREF and SRC attributes. That's
the purpose of the extractAttribute
method. This method requires three
arguments: a string, an upper case attribute
name to extract, and a default value that's
used if there is no attribute. So you might
write:


String hrefurl=
    extractAttribute(token,
    "HREF=","NoLink.htm");


Notice that the HTML author can provide an
empty attribute (such as HREF=""), which won't
return the default string, because the attribute
isn't missing: it's simply empty.

Parsing the attribute value is tricky because
there are three cases to handle:


* An attribute value with no spaces doesn't
require quotation marks (HREF=x.htm);

* the value may be enclosed in single quotes
(HREF='x.htm');

* and the value may be enclosed in double
quotes (HREF="x.htm").

The code assumes that an unquoted attribute
ends with a space or a closing bracket. And
any quoted attribute should end with a
matching quote.


Listing 1

import java.io.*;
import java.net.*;
import java.util.*;

public class AHParse {
    int lastchar=-2;   // -2 means no character has been read yet
    StringBuffer current;

    public String parse(InputStream is) throws IOException {
        int c;
        int ender='<';
        int endoffset=0;
        boolean intag=false;
        boolean multicomment=false;
        int dashct=0;
        c=lastchar;
        current=new StringBuffer();
        if (c==-2) c=is.read();
        if (c<0) return null;
        if (c=='<') {
            intag=true;
            endoffset=1;
            ender='>';
            current.append((char)c);
            c=is.read();
            if (c=='!') {
                current.append((char)c);
                c=is.read();
                if (c=='-') {
                    current.append((char)c);
                    c=is.read();
                    if (c=='-') multicomment=true;
                }
            }
        }
        // read to end
        while ((c!=ender && !multicomment) ||
               (multicomment && c==ender && dashct!=2) ||
               (multicomment && c!=ender)) {
            current.append((char)c);
            c=is.read();
            if (c==-1) {
                endoffset=0;
                lastchar=-1;
                break;
            }
            if (lastchar=='-') dashct++; else dashct=0;
            lastchar=c;
        }
        while (endoffset--!=0) {
            current.append((char)c);
            lastchar=c=is.read();
        }
        return current.toString();
    }

    public void processURL(String urlstring) throws
            MalformedURLException, IOException {
        URL url=new URL(urlstring);
        InputStream is=url.openStream();
        String token;
        do {
            token=parse(is);
            // pass null to doElement to indicate EOF
            if (!doElement(token)) break;
        } while (token!=null);
        is.close();
    }

    // Override in subclass
    public boolean doElement(String token) {
        if (token==null) return true;
        System.out.println(token);
        System.out.println("###");
        return true; // keep going
    }

    public static void main(String[] args) throws Exception {
        AHParse parser = new AHParse();
        parser.processURL(args[0]);
    }
}
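
The real extractAttribute lives in the online listing, which is not reproduced here, so the following is only a guess at how such a method could handle the three cases. The three-argument shape follows the description in the text; the class name AttributeSketch and every implementation detail are assumptions. It is deliberately naive: it takes the first place the name appears, even inside another attribute's value.

```java
class AttributeSketch {
    // name includes the '=' sign, as in "HREF=".
    static String extractAttribute(String token, String name, String dflt) {
        int at = -1;
        for (int i = 0; i + name.length() <= token.length(); i++) {
            // Compare without regard to case, but keep indexes into the
            // original string so the extracted value keeps its case.
            if (token.regionMatches(true, i, name, 0, name.length())) {
                at = i;
                break;
            }
        }
        if (at < 0) return dflt;                  // attribute is missing
        int start = at + name.length();
        if (start >= token.length()) return "";
        char q = token.charAt(start);
        if (q == '"' || q == '\'') {              // quoted: read to matching quote
            int end = token.indexOf(q, start + 1);
            if (end < 0) end = token.length();    // unterminated quote
            return token.substring(start + 1, end);
        }
        int end = start;                          // unquoted: stop at ' ' or '>'
        while (end < token.length()
                && token.charAt(end) != ' ' && token.charAt(end) != '>') {
            end++;
        }
        return token.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(extractAttribute("<A HREF=x.htm>", "HREF=", "NoLink.htm"));
        System.out.println(extractAttribute("<a href='My Page.htm'>", "HREF=", "NoLink.htm"));
        System.out.println(extractAttribute("<A NAME=top>", "HREF=", "NoLink.htm"));
    }
}
```

An empty value such as HREF="" comes back as an empty string rather than the default, matching the behavior described above.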

Improving Usability 

After experimenting with the WebParse class
for a while, I started to think about how it
could be made more useful. The logic regarding
the hyperlinks and the images can't be easily
changed. However, I thought it would be nice if
there were an easier way to specify which tags
the program lets pass through.

The result was WebParse2 (available
online). This version requires two command-line
arguments: the Web site, and the name of
a property file that contains those tags to pass
through. Additional command line arguments
make up the image exclusion list, as before.
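
The layout of that property file isn't shown in the article, so the sketch below simply assumes one tag name per line, used as the property key. The names TagPropertiesSketch and loadPassTags are likewise hypothetical; only java.util.Properties itself is standard.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

class TagPropertiesSketch {
    // Reads a property source and returns its keys as upper-case tag names.
    static Set<String> loadPassTags(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Set<String> tags = new TreeSet<>();
        for (String key : props.stringPropertyNames()) {
            tags.add(key.toUpperCase(Locale.ROOT));
        }
        return tags;
    }

    public static void main(String[] args) {
        // Assumed file contents: comments start with '#', one tag per line.
        String file = "# tags to pass through\nb\npre\np\nbr\na\n";
        System.out.println(loadPassTags(new StringReader(file)));
    }
}
```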

Expansion Ideas 

It would be interesting to build this into a
proxy server (see the April 2001 "Java@Work"
column for more about proxy servers). Then
Web appliances could access any site through
this proxy to fetch a reduced page view.

You could probably devise smart ways to
represent other tags simply as well. The current
method of handling tables is simplistic, and
frames don't work at all. It wouldn't be hard to
replicate <HR> tags, among others. Finally, it
might be worthwhile to filter out some tag
attributes, something that the current code
doesn't attempt. For example, you might
remove any color information from <FONT>
tags, leaving only font size.

This technique could be useful in other
situations too. For example, a text-to-speech
program for reading Web pages aloud might
use this parsing method to break the page into
simpler items. The AHParse class could also be
used in a Web spider, a link checker, or any
other automated program that reads HTML.

Other Languages 

Implementing this parser in other languages
should be no problem. The program doesn't use
any exotic techniques or libraries. In fact,
languages that use regular expressions might let
you create more expressive parsing rules.
(Adding regular expressions to Java is easy; see the
January 2001 "Java@Work" if you're interested.)
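
As an illustration of what a regular-expression rule could look like, here is a sketch using the java.util.regex package, which joined the standard Java library after this column was written and so is not what the article's code relies on. The class name RegexTokenSketch and the pattern itself are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTokenSketch {
    // Alternatives are tried in order: a new-style comment, any other
    // tag, or a run of text up to the next '<'.
    private static final Pattern TOKEN =
            Pattern.compile("<!--.*?-->|<[^>]*>|[^<]+", Pattern.DOTALL);

    static List<String> tokens(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokens("Hi<!-- a > b --><B>there</B>")) {
            System.out.println(t);
        }
    }
}
```

The whole tokenizing rule fits in one pattern, at the cost of having to read the pattern carefully to see what it accepts.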

Java has an advantage over some other
languages because of how easily it reads a Web
page. Still, you can get the job done in most
languages if you know how; it just might not
be as concise as you want. You can find a Perl
snippet in Example 1 (available online) that
makes a simple HTTP request (not a full
request) and processes the input. This isn't as
concise as Java, but it's easy to cut and paste.
Of course, using regular expressions with Java
wouldn't be as concise as using Perl, so there
are always trade-offs.

The End of Parsing 

When faced with a parsing job, should you use
an elegant parser created with JavaCC (or tools
for another language, like YACC or Bison), or
opt for a brute force, ad hoc method instead?
Writing a full-blown parser can be complex,
especially if you aren't familiar with the tools.
However, a quick and dirty parser like the one I
wrote here can be difficult to maintain, and
awkward when you want to extract deep
meaning from the input.

Often, the answer lies somewhere in between.
For example, a simple, recursive-descent
parser can usually handle medium-size
jobs without much complexity. Regardless of
which you choose for a project, it's helpful to
have a few backup techniques available in your
programming arsenal.

Al is the author of many popular programming
books. You can find him on the Web at
www.al-williams.com.
Flat files aren't always a bad thing. Here's how to keep data simple
while making it look sophisticated.


PROGRAMMING
WITH PERL


Randal L. Schwartz


Rendering Calendars in HTML


I keep a text-file link from my home page that shows all
of the places that my crazy conference and training
schedule takes me. Each entry is a single line showing a date
range in day-month-year format, followed by a short
phrase, and possibly a URL for further information.

For a long time, I've been meaning to move these public
schedule items out of a flat file and into a real database.
Then I could have them display as a nice HTML table
calendar, with a link from my page. The convenience of just
editing the flat file kept me from getting around to it until it
occurred to me that I could keep my flat-file data source,
and merely interpret the data as it was.

I whipped out the documentation for Date::Manip
(from the CPAN) and figured out the correct date format
with which I could compute date ranges. I then wrote a
regular expression or two to pick up the date range pieces
and feed them to the Date::Manip routines for
computation. Within a short time, I had a program that output the
text of each activity, preceded by every date connected to
that item. Cool.

Next I had to render it in a nice HTML table. Bleh. I hated
the thought of even more date calculations and HTML
wrangling, even though Date::Manip can do just about
everything. Luckily, I recalled stuffing away a note to look
at HTML::CalendarMonthSimple, and sure enough, this
module was exactly the ticket.

But after hooking together the date-extraction logic with
the HTML rendering logic, I discovered that my poor little
ISP's shared Web server was getting nailed each time I
tweaked the HTML color settings and hit reload. By
inserting some simple profiling code, I discovered that the
Date::Manip and parsing code was taking nearly 10
seconds, 100 times longer than the HTML rendering.

I was disheartened. But after a brief period of reflection, I
realized that the analysis only needed to be done once,
each time I edited my schedule file, which was only once
every few weeks. I merely had to cache the results. The
cached data would remain valid as long as it was newer
than the modification time of the same file.

In the past, I've used File::Cache to perform this kind
of caching, but the Cache::Cache module family (by the
same author) has now matured to the point of being useful,
so I chose the newer interface. Once I had completed the
caching code, everything worked great!

I also remembered those little URLs in a few lines of the
schedule, and wrote some code to recognize them and turn
them into actual links. You can see the result in Listing 1.
Lines 1 through 3 start nearly every CGI program I write,
turning on taint checking, warnings, all of the compiler
restrictions, and disabling buffering on STDOUT. Turning on
warnings was harder than I thought it would be, as I'll
explain when I reach the end of the program.

Lines 5 through 8 bring in the expected modules. Of
these, only CGI comes with Perl. You can find the
remainder in the CPAN.

Line 10 is the only configuration constant: the location of 
the text file containing my calendar. Lines in this file that 
match patterns of interest will end up on the calendar, and 
everything else will be ignored. 

Lines 12 to 21 figure out which month we’re displaying. Line 
14 grabs the current local time. Lines 15 to 17 obtain the month 
from the month parameter, presuming it's provided and it's in 
range. If it isn't, we quietly fall back to the current month. 
Similarly, lines 18 to 20 grab the appropriate year value. 
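
The quiet-fallback rule for the month and year parameters is easy to express outside Perl too. The sketch below is a Java rendering of the idea, not a translation of the listing: the names MonthParamSketch and inRangeOr are invented, and the year bounds in main are sample values chosen for the example.

```java
import java.time.LocalDate;

class MonthParamSketch {
    // Returns the parsed value if 'param' is all digits and lies within
    // [lo, hi]; otherwise quietly returns the fallback.
    static int inRangeOr(String param, int lo, int hi, int fallback) {
        if (param == null || param.isEmpty() || param.length() > 9) {
            return fallback;
        }
        for (int i = 0; i < param.length(); i++) {
            char c = param.charAt(i);
            if (c < '0' || c > '9') return fallback;   // any non-digit
        }
        int n = Integer.parseInt(param);
        return (n >= lo && n <= hi) ? n : fallback;
    }

    public static void main(String[] args) {
        LocalDate now = LocalDate.now();
        String monthParam = args.length > 0 ? args[0] : null;
        String yearParam = args.length > 1 ? args[1] : null;
        int month = inRangeOr(monthParam, 1, 12, now.getMonthValue());
        int year = inRangeOr(yearParam, 2001, 2005, now.getYear());
        System.out.println(year + "-" + month);
    }
}
```

Rejecting anything that isn't purely digits before converting it matters in a CGI setting, where the parameter comes straight from the request.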

Lines 23 and 24 set up the cache connection. The name- 
space and username ensure that I get consistent cache 
access whether I run this program as myself or as the Web 
server user. 

Line 26 holds the hash of events. Actually, it's a hash of
years, with subhashes of months. Each subhash holds
arrays of arrays representing tuples of day/event-string
pairs, so the first two days of the second event above could
be represented as:

$events{"2002"}{"1"} = [
  [11, "in the Southern ..."],
  [12, "in the Southern ..."],
];

Line 28 creates a fingerprint for the current event file.
We'll note the file's device number, inode number, and last
modified timestamp. If a new file is renamed into this
position, or if this file is edited in any way, it'll have a different
fingerprint, and we'll reprocess it.

Lines 30 to 36 fetch any existing cache value. The first 
item of this cache is a hashref of the previous value for 
%events. The remaining three items are the identity values 
I used to generate that cache, which we compare in line 32 to 
our new fingerprint. If they’re the same, we have a valid 
cache, and we can use what we've seen. 
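
The same validate-by-fingerprint idea can be sketched in Java. This is an analogy rather than a port: the cache here lives in memory instead of on disk, the file's size is added to the fingerprint as an extra assumption of mine, and all names (FingerprintCacheSketch, fingerprint, get, demo) are hypothetical. Reading the file stands in for the expensive parsing step.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class FingerprintCacheSketch {
    private String cachedFingerprint;   // fingerprint the cached value came from
    private String cachedValue;
    int recomputations = 0;

    // fileKey() typically encodes device and inode on Unix-like systems
    // and may be null elsewhere.
    static String fingerprint(Path file) throws IOException {
        BasicFileAttributes a = Files.readAttributes(file, BasicFileAttributes.class);
        return a.fileKey() + "|" + a.lastModifiedTime().toMillis() + "|" + a.size();
    }

    String get(Path file) throws IOException {
        String now = fingerprint(file);
        if (!now.equals(cachedFingerprint)) {
            // No valid cache: redo the expensive work and remember its source.
            cachedValue = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            cachedFingerprint = now;
            recomputations++;
        }
        return cachedValue;
    }

    // Writes a temporary file, reads it twice, edits it, and reads it again.
    static int demo() {
        try {
            Path plan = Files.createTempFile("plan", ".txt");
            try {
                Files.write(plan, "first version\n".getBytes(StandardCharsets.UTF_8));
                FingerprintCacheSketch cache = new FingerprintCacheSketch();
                cache.get(plan);
                cache.get(plan);   // unchanged file: served from the cache
                Files.write(plan, "a longer, edited version\n".getBytes(StandardCharsets.UTF_8));
                cache.get(plan);   // fingerprint differs: recomputed
                return cache.recomputations;
            } finally {
                Files.deleteIfExists(plan);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("recomputations: " + demo());
    }
}
```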

tf there are no events, however, we presume we didn’t 
have a valid cache. The program then parses the current 
fife, starting in line 3S, Line 40 brings In the slow, but 
powerful Date: :Manip package. Even Sullivan Beck, the 
author admits that loading this module is slow, so the 
program won't load it unless it needs it. Between the 
requi re statement and the call to Import immediately 
following, f have the equivalent of use Date: :Manip, but 
performed at runtime instead of compile time. The call to 
zero-out the path is required because I’m running with 
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fl'/use/bln/perl 
use strict; 

Si ++; 


-Tw 


54 

55 

56 


my E$y T $fti, $dl - UnlxQate 
(S '’XT', "W , "Wi; 
pusn o)CSevenrts{&+$yHGH-Sm>>i [SC, 
(where!; 


use CGI gw I:allJ ; 

usb HTML: :CalendanMonthS1 triple; 

use Cache::FIleCache; 

use URI::Find; 


57 

58 

59 


Icache-*set(’dala 1 
Smourident-ity] ] ; 


11 

12 

13 

1* 

15 

16 


rmy (PLANFILE - Vhome/merlyri/. plan 16 


17 

IB 

19 


ZB 

Z1 

ZZ 

23 


my (Ifanmontti, Sforyear); 

( 

my aHOU - lowltlme: 

(formanth = paramt' merith 1 5 j 
iformonth - (NOW 14 Hi unless defined 
Sfarmnnth arid (formontfl ! - AD/ ana 
Sfonncnth >= 1 and (fonmonth «= 1Z; 
Iforyear = parain("year 1 I; 
ifopyear = (NQWt 51+1930 unless defined 
Jfcryear and (foryear !- AD/ and 
Sforyear ?= 2881 and (farye&r <= zaas; 

> 


50 

61 

62 

53 

64 

66 

56 

67 

66 

69 

70 


my Seal - HTML: :Calendarflonth5iiTpl@-> 
neuiyea'” => (foryearv, month => 
if DrnanthJ; 

$tral->w1 dth C '100%' \; 

$cal->bgcol nr (’white 1 I \ 

Seal-?tadaycolors'grey'I: 
Scal->bor-1ercolar (' black' 1 j 
$cal-?cantfntcnlDr[ r black'); 
(caV->todaycqntintcolnn[* black'); 
Seal—>heaaercalor('fccffcc'>; 


{ 


1) - r 


71 


12) 


72 . 


PA 

25 

26 
27 
26 


my (cache = Cactie: :F1leCac*ie-?new 
[[name-space *=> ‘uhereami 1 , 


73 


1 nobody 1 ) 1; 


74 


rry %ewntfi; 


my Shawl dEntity 


tstat((PLANFIL 6 )> 


If (my (cached «■ Scacfte->getrdata' n ( 
my <(events, Sfidentity) ■ aBcached; 

If t"5how1dentity" eq "fiJidentlty") I 
p# we have a valid cache 
^events - %Sevents; 

> 

> 


75 

76 

77 
70 
79 

30 

31 
8£ 
S3 


my (myself = uni(^relative ■ 
my (previous - sprlntt 

'Ts?year=^a&mnntn^d", (myself, 

Iforrramth = 1 ? ((foryear - 1, 

(Sf ary ear, Sfarmonth - U; 
my (next = sprlntt 

N Ks ? yea r^tdBmont h=W" * Imy s elf * 

Sfdrffonth = 12 ? ((foryear + 1, lj : 

(Ifanyear, Sfarmonth + 11; 

ScaWheader(table({width => '1BBK\ 

border => 0, 

cellspacing 0 t cEllpaddlng => 21, 
TrEtdC(align =? 'left', width -> 'l' 1 r > , 

a((href =? (previous ), " previous"tl, 
td((align => 'center's width-? I* 1 ), 
b (Scal-smonttmame, (eal-?yeari i , 
tdt(align => ” right' t width =? '1*'), 
aUhref -> (next), "next^J J) i); 


unless (Keventsl ( 

ff# no cache, so compute from scratch 
require Date::Man1p; local iEMVCPATHU 


65 

06 


print hEsder, start_html("My Calendar for 
”. Sc al->marithnaii»." ".Soal->year ]; 


  Date::Manip->import;
  @ARGV = $PLANFILE;
  while (<>) {
    next unless
      /^(\d+)\s+to\s+(\d+)(\s+\S+\s+\d+):\s+(.*)/ or
      /^(\d+\s+\S+)\s+to\s+(\d+\s+\S+)(\s+\d+):\s+(.*)/ or
      /^(\d+\s+\S+\s+\d+)\s+to\s+(\d+\s+\S+\s+\d+)():\s+(.*)/;
    my ($start, $end, $where) =
      ("$1$3", "$2$3", $4);
    $end = DateCalc($end, "+ 1 day");
    for (ParseRecur("every day", undef,
                    $start, $end)) {

for (@{$events{0+$foryear}{0+$formonth}}) {
  my ($d, $where) = @$_;
  for ($where) {
    find_uris($_, sub {my ($uri, $text) = @_;
      qq{\1<a href="\1$uri\1" target=_blank>\1$text\1</a>\1} });
    s/\G(.*?)(?:\001(.*?)\001)?/escapeHTML($1).
      (defined $2 ? $2 : "")/eig;
  }
  $cal->addcontent(0+$d, $where);
}

{ local $^W = 0; print $cal->as_HTML; }

print end_html;


taint-checking enabled, and the module occasionally
wants to call a child process.

Lines 43 and 44 start the input file processing.
Lines 45 through 49 extract the date ranges.
Because I’m using abbreviated ranges in my file,
the code needs to parse many variations, which
required careful consistency when placing
parentheses in my regular expressions.

Lines 50 through 54 compute every day that
belongs to a range: first by adding one day to
the end of the range, then by generating a
recurring value that’s true for every day
beginning at the start date and ending before the
incremented end date. For each of these items,
we add to the event list under the correct
month's subhash.
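
The range expansion described above can be sketched in Python (not the column's Perl). This is a minimal illustration, assuming events are kept in a nested year/month mapping like the listing's subhashes; the function names and the sample event are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def expand_range(start, end):
    """Yield every date from start through end, inclusive."""
    # Same idea as the listing: bump the end by one day,
    # then generate days that fall before that bumped end.
    stop = end + timedelta(days=1)
    day = start
    while day < stop:
        yield day
        day += timedelta(days=1)


def add_event(events, start, end, where):
    """File one (day, text) pair per day under events[year][month]."""
    for day in expand_range(start, end):
        events[day.year][day.month].append((day.day, where))


# Hypothetical sample entry that crosses a month boundary.
events = defaultdict(lambda: defaultdict(list))
add_event(events, date(2001, 8, 30), date(2001, 9, 2), "Conference")
```

A range that crosses a month boundary ends up split across two month buckets, which is what lets the renderer pull out a single month's items later.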

Line 58 stores the newly computed event
items in the cache along with the signature of
the data that generated this event list. So now
we have a nice event list, possibly obtained
from the cache, or else the cache has been
updated. Time to render it, starting in line 59.

Lines 59 to 66 create the basic calendar 
structure, including setting up some of the 
appearance items. 

Lines 68 to 81 handle the forward/backward
links, which I’ve placed into the title. Lines 70
and 71 compute the prior month as a link that
re-invokes this script with appropriate
year/month parameters, using the URL
retrieved in line 69. Similarly, lines 72 and 73
compute the next-month link.
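
As a rough Python sketch of the same wraparound logic (the function names and the script name in the usage note are illustrative, not taken from the listing):

```python
def neighbor_months(year, month):
    """Return ((prev_year, prev_month), (next_year, next_month))."""
    # January wraps back to December of the previous year;
    # December wraps forward to January of the next year.
    prev = (year - 1, 12) if month == 1 else (year, month - 1)
    nxt = (year + 1, 1) if month == 12 else (year, month + 1)
    return prev, nxt


def month_link(script, year, month):
    """Build a self-referencing URL carrying year and month parameters."""
    return "%s?year=%d&month=%d" % (script, year, month)
```

For example, month_link("cal.cgi", *neighbor_months(2001, 12)[1]) gives "cal.cgi?year=2002&month=1".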

Lines 74 to 81 adjust the calendar’s header so
that it's a three-cell table. The center cell is the
month name, and the left and right cells are
the previous- and next-month links.

Line 84 begins the program’s output, including
titling the page with the computed month
and year. Rendering begins in line 86, where we
pull out the array of arrayrefs for the current
month’s items only. Line 87 extracts the
specific day and text string for each event.

Lines 88 to 92 search for all URL-like strings
and replace them with actual links. The text is
added to the calendar cell in line 93.
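
The link-replacement step needs some care: the surrounding text has to be HTML-escaped while the generated anchor tags must not be. A simplified Python sketch of that idea follows. It assumes a deliberately naive URL pattern rather than the matching rules of the URI-finding module the listing uses, and the names are hypothetical.

```python
import html
import re

# Naive pattern (an assumption for illustration): http/https URLs up to
# the next whitespace, angle bracket, or double quote.
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def linkify(text):
    """Escape plain text and wrap URL-like strings in anchor tags."""
    out = []
    pos = 0
    for match in URL_RE.finditer(text):
        # Escape the plain text that precedes this URL.
        out.append(html.escape(text[pos:match.start()]))
        url = match.group(0)
        out.append('<a href="%s" target="_blank">%s</a>'
                   % (html.escape(url), html.escape(url)))
        pos = match.end()
    # Escape whatever follows the last URL.
    out.append(html.escape(text[pos:]))
    return "".join(out)
```

Trailing punctuation stuck to a URL would be swallowed into the link by this pattern, so a real implementation would want a stricter matcher.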

Line 96 outputs the HTML for this calendar.
Unfortunately, it seems to trigger many “undef
used as a string” warnings. Rather than track
them all down, you can just turn off warnings
during this step. Line 98 finishes up the HTML
page, and we’re done.

Now I have a fancy GUI interface to my
calendar. A possible addition might be a color-
coded range of events to distinguish them
from one another. But that’s for another day.
Until then, enjoy!


Randal (merlyn@stonehenge.com) has coauthored
the must-have standards: Programming Perl,
Learning Perl, and Effective Perl Programming.
PRODUCT REVIEWS

Beta 2: Visual Studio .Net Enterprise Edition
Microsoft
www.microsoft.com
Pricing TBD


Microsoft's Golden Road to the Internet

Microsoft's future is bound to Visual Studio
.Net, which is probably why it has made so
many changes. If you’ve used previous
releases of Visual Studio, this will be a whole
new world. If you've used enterprise suites
from other companies, you have to get this
one and try it out.

At first glance, everything appears similar
to previous development environments: There
are project windows, code windows, object
windows, and help windows (see Figure 1).
When you start your first project, let's say
with Visual Basic (VB), you select the ASP.Net
project. The project comes up with a form, as
you’d expect. There's a Web form named
WebForm1.aspx. And there’s a file called
GLOBAL.asax, and another called
WEBl.asax.vb. You open up the code and it has
words like “class” and “imports,” as shown in
Listings 1 and 2 (available online).

So What Happened? 

To understand what happened to VB (indeed,
what happened to Visual Studio), we need to
look under the covers and discuss Microsoft’s
.Net initiative. This initiative begins with
the .Net Framework: essentially a set of
libraries, classes, and interpreters that
constitute the foundation upon which all .Net
services and languages are built. The libraries are
collectively known as the Common Language
Runtime (CLR) and include fundamental
programming services such as memory
management, process management, and
security enforcement. Part of this library is
also a compiler to process language
instructions.

This library has a set of classes, known as
.Net Framework Unified Classes, that perform
systems and programming tasks such as file
management, system input and output, and
operating system functionality. On top of the
systems classes, other classes are built that
do the things that we really want to get to:
data classes (ADO.Net), Windows forms, and
XML and Web classes.


The base programming language for the
entire .Net Framework is a new language called
C# (pronounced C-Sharp), an object-oriented
language complete with classes, objects,
inheritance, and polymorphism. Because C# has
these characteristics, the classes in the Unified
Library have the same characteristics.

To make the framework and all services
available for other programming languages,
Microsoft created Visual Studio, which lets
programmers use the language of their choice
(as long as it's C#, C++, or VB). Visual Studio
precompiles, or tokenizes, all of the languages
into a common format, the Common Language
Infrastructure (CLI), that the Framework can
execute. And because of the C# roots, all of the
languages in Visual Studio were modified to be
object-oriented and to access the C# properties
and methods.

Web-Centered Development 
with Web Forms 

Visual Studio did more than make an interpreter.
It also created new ways of developing that
more tightly integrate Windows and Web
development. The main elements in this development
process are called Web Forms, XML Web Services,
ADO.Net, ASP.Net, and Windows Forms.

For Web development, you use Web Forms.
These are server-side forms that have added
Web controls. The controls include the basic
HTML controls (text box, radio buttons,
listboxes, and so on) and additional Web controls
such as a Calendar, an Ad Rotator, and a
Crystal Report Viewer.

You can use all of the Visual Studio
languages (C#, Visual C++, and VB) to
develop Web Forms using the Visual Studio
IDE. The Web Form becomes a palette within
the IDE to which you add the visual elements,
or controls, to create a Web page. By clicking
on the controls, you can add event code to the
page and store it in a source file. You also
have the freedom to add programming behind
the page in either the Visual Studio compiled
languages, or one of the script languages,
such as VBScript, JScript, or JavaScript. Within
this environment you can incorporate the Web
Services discussed before into your Web
Forms, greatly simplifying Web application
development. The Web Forms page contains
the visual, XML-based representation of the
page and a source file with event-handling
code. The source is compiled into the same
intermediate code used by the programming
language. These files reside on the server.
When a client selects a page on an ASP.Net
server, the page transmits to the browser as
HTML code. HTML or XHTML pages can be
imported from other sources, such as
FrontPage, and be converted to Web pages. If
you're deploying to different form factors and
client types, Web Forms can also detect each
type of client and format the output
appropriately: from WML for phones, to HTML 3.2
for older browsers, to DHTML for IE 5.5+.

Pros: Familiar IDE. Simplified Web integration
into Windows programming.

Cons: Even seasoned developers will need to
learn a new language or learn extensive
modifications to an existing language.

XML Web Services 

Web Services are services exposed to
applications on the Web that connect to them. XML
Web Services are typical Web Services, but
instead of using a proprietary interface, they
use XML to transmit and receive information.
Visual Studio makes it extremely easy to
create Web Services: they’re just another
kind of project that, when you add the code
and functions, is deployed on a server. You
only need to select New Project, then select
the ASP.Net Web Services and name the
project, and you’ve begun. From here, you add
the logic to support your service (with the
<WebMethod> descriptor), and build your
service.

To test the service within the IDE, press the
F5 key. Visual Studio then builds a page on
which you can test your services. When you
enter parameter information, it returns the
result for you. The IDE also gives you a sample
schema page that shows you how to use the
Web Service. This doesn’t deploy the service
(you still must add the generated pages to your
.Net-enabled site), but it does make
development wonderfully easy.

ADO.Net and ASP.Net 

ADO.Net is the data access component of the
.Net Framework. It provides access to SQL
Server and OLE DB data sources via an XML
interface. For Web applications, first it provides
the schema in XSD format, then transmits data
in XML datasets. For ASP.Net programming, this
makes database and server-side programming
much easier. You can easily incorporate XML
data from other sources into your applications,
and you can use XML to transmit data from
your application to others. ADO.Net is fully
integrated into Visual Studio .Net and is the
primary means of data access for the
applications developed with it.

Figure 1. The Visual Studio .Net IDE looks familiar, but each part offers
more features.

ASP.Net is where all of the Web Services,
Web Forms, and programs converge. It’s built
atop the .Net Framework, and as such, you
can access all of the components there. ASP.Net
executes the Web Forms and Web controls
and transmits them to the client. As the
successor to the current version of Active
Server Pages, ASP.Net still supports existing
ASP pages. However, it’s best used with the
newer .Net technologies. ASP.Net performs
much more quickly than the older ASP
technology, and in my testing, it delivered
three to ten times the performance. Finally,
ASP.Net supports the traditional cookie method
of validating users, as well as the new and
controversial Passport technology.

Bits and Pieces 

Windows Forms are the next generation of the
old VB Forms. These can be used and modified
by any language. You would expect them to
connect only to the ADO.Net data sources.
But surprisingly, they too can connect to the
Internet and use the Web Services I described
earlier. Windows Forms are also objects as
part of the new .Net paradigm, and you can
use the principles of inheritance to create a
parent form and have other forms inherit its
characteristics.

Then there’s the Mobile Internet Toolkit, an
add-on that fully integrates with Visual Studio
.Net and lets you build Internet applications for
mobile devices including the Pocket PC and
several telephones (like the Nokia 7110). The
toolkit uses its own forms (Mobile Web Forms
Controls) and designer (Mobile Internet
Designer) that work with the Visual Studio .Net
IDE to provide a drag-and-drop mobile
development environment.

An additional new product, and a totally
unexpected one, is Application Center Test (ACT),
version 1.0. ACT is a Web Load Analyzer that
you can use to stress test your Web
application in a variety of circumstances, or to
simulate varying numbers of users. This type of
product is a must-have, as any serious Web
developer knows.


A version of Crystal Reports—reminiscent of 
VB’s early days—is included with the Beta 2 
release, in keeping with a strong Microsoft 
tradition. This version of Crystal Reports lets 
you create dynamic reports that your users can 
access on the Web. 

If you’ve worked with other Web
development environments, you know that some of
them are integrated with modeling tools and
methodologies, like Rational Rose. Visio 2002
with Enterprise Modeling Tools is included with
this Beta and it, too, was a pleasant surprise.
This edition of Visio has been beefed up to
include not only the modeling and
diagramming components, but code generation in C#,
C++, and VB. It also contains Internet Explorer
6.0 (Beta) and Microsoft Data Access
Components 2.7.

Noticeably absent from Beta 2 were Visual 
J++ and Visual FoxPro. As you can surmise from 
the integration of the Web with the Visual 
Studio languages, Visual InterDev is no longer 
necessary. Within VB, some of the previous 
Web development methods, such as DHTML 
Applications and ActiveX Documents, are no 
longer supported either. 

Installation Warnings 

There is one installation caveat: if you’re
planning to run this edition of the Beta, you won’t
be able to take advantage of its complete
functionality unless you have SQL Server
installed. I initially installed VS Beta 2 on a
clean (read: newly installed) instance of
Windows 2000 Advanced Server, on which I
hadn’t yet installed SQL Server. Not only was I
unable to load some of the Beta features, but
I had some unexpected problems that I
suspect were related to its absence. I reloaded
the OS, loaded SQL Server 2000 onto my
system, then loaded Visual Studio .Net.
Everything worked just fine.

Two Thumbs Up 

Visual Studio .Net is truly a tour de force for
Internet and intranet development. It is
Web-centric and, if you’ll pardon my use of the
term, Webilicious. The integration of Web
Forms with C# and VB is complete, and has
erased the boundaries between traditional
Windows programming and Web
programming. There’s so much here that it will take
developers a while to discover it all. There is
certainly a new alphabet soup to be digested.
But get out your spoon, dig in, and enjoy. It’s
worth the effort.

-John Pearson

John programs in VB and C++ for the Church of
Jesus Christ of Latter-Day Saints, does private
Web development, and contributes to magazines
such as Visual C++ Developer’s Journal and
JavaPro.
SERVER

ing Appliance

Give your Web server some relief by creating your own dedicated log
analysis appliance.

Jim Jagielski

Online resources
Starting points for creating your new, dedicated log server.

As founder of an ISP and Web hosting company, as well as
someone who's done a lot of consulting, I’ve witnessed
firsthand how companies have begun to embrace network
appliances. I can’t think of any enterprise-level setup that
doesn’t contain a load balancer (something that I consider
an appliance), a network cache, and at least a data storage
appliance or two.

The reasons are obvious. By segmenting different
requirements to dedicated equipment, you avoid interdependencies
that can adversely affect performance and reliability. There is
one appliance that makes logical sense and yet (as far as I
know), doesn’t exist: the log analyzer appliance.

Businesses often have an outside company perform their
log analysis for them. To be fair, specialized firms that do
detailed data mining of Web log files are important, and
provide valuable insight. But with the log analysis programs
available today, you can perform the more mundane
analysis in-house, saving the expense of hiring outside firms for
the really detailed and involved mining.

I have issues with running log analysis tools on the
Web server itself. As frequent readers may know, I’m a
proponent of dedicating as many server resources to the
Web server as possible, for performance and security
reasons. Log analysis programs, by their very nature, tend
to consume lots of memory and CPU horsepower; it’s
hard work.

It makes no sense to tune the hardware and operating 
system for peak Web performance if you want it to perform 
the intensive analysis work as well. This is especially true if 
you want on-demand analysis. I can’t tell you how many 
clients have told me that they like checking the logs every 
15 minutes, but can’t understand why their site is so slow. 

I usually recommend a better solution: creating your own
log analyzer appliance. Doing so not only takes the load off
of the Web server, which is always a good thing, but it also
provides a central location for all of your log analysis needs.

logging time

Analog
www.statslab.cam.ac.uk/~sret1/analog

NetTracker
www.sane.com

Terra Soft Yellow Dog Linux
www.yellowdoglinux.com

OpenSSH
www.openssh.org

BIND/named
www.isc.org/products/BIND

WebTrends
www.webtrends.com

Build It 

We shouldn’t get too caught up with the term “appliance”
at this point. Obviously, building your own standalone box
isn’t as easy as plugging in a toaster. Instead, we’re using
appliance as a descriptive term: a device specifically
designed for a single purpose. It doesn’t have to be a 1U
box, for example, with only an Ethernet port and no other
connectivity. Any old Pentium 2 system will work fine.

As with most computer tasks, a fast CPU is better than 
a slow one, but because we aren’t requiring real-time, on- 
the-fly analysis reports, this isn’t too much of a concern. 
Your hardware should have at least 256MB of RAM, as 
analyzer software typically makes extensive use of in- 
memory caching. You also need disk space for temporary 
files, but unless you plan to store archived files locally, a 
few gigabytes of available disk space are more than 
adequate. 

On the operating system side, again, there are no real 
requirements that would force you to choose one OS over 
another. Although it used to be true that most of the really 
good log analyzer software packages available were NT only, 
that’s no longer the case. For security and efficient use of 
resources, I prefer BSD and Linux systems, with FreeBSD as 
my top choice. 

Log files can be pretty sizable beasts, thus your
configuration should be able to handle large file sizes and have
plenty of available file descriptors, to handle numerous
open files at once. The appliance requires very few standard Unix
background processes, or daemons, so you can disable
daemons such as sendmail and identd.

One process that you definitely do want running on the
appliance is the BIND Domain Name Service daemon. For
performance reasons, your Web server shouldn’t perform
any DNS name lookups; instead, it should log all client
requests by IP address. The log appliance can determine the
hostnames for all of those IP addresses later, by performing
the reverse DNS lookups itself during the analysis process.
However, having the appliance use a remote DNS server
adds overhead and latency to each lookup. Instead, you
should install the latest implementation of BIND, and
configure it as a local caching server, using the named.boot
file shown in Listing 1.
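
To make the deferred-lookup idea concrete, here is a small Python sketch of resolving logged IP addresses at analysis time, with an in-process cache so that each address is looked up only once. It illustrates the approach and is not how any particular analyzer works; the function names are hypothetical, and the resolver is passed in as a parameter so the logic can be exercised without a network.

```python
import socket


def make_resolver(lookup=socket.gethostbyaddr):
    """Return a function mapping an IP address to a hostname, with caching."""
    cache = {}

    def resolve(ip):
        if ip not in cache:
            try:
                # gethostbyaddr returns (hostname, aliases, addresses).
                cache[ip] = lookup(ip)[0]
            except OSError:
                # No reverse record (or lookup failure): keep the raw IP.
                cache[ip] = ip
        return cache[ip]

    return resolve
```

With a local caching name server on the same box, each first-time lookup made this way avoids a round trip to a remote DNS server.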

Listing 1

// We only allow local requests. Could add
// specific IP address as well
acl "allowme" { 127.0.0.1; };
options {
    // the working directory
    directory "/usr/local/etc/namedb";
    pid-file "named.pid";
    allow-query { "allowme"; };
};

// root server hint file
zone "." {
    type hint;
    file "root.hints";
};

// loopback
zone "0.0.127.in-addr.arpa" {
    type master;
    file "localhost.rev";
    notify no;
};

Roll With it

With the hardware selected, we’re ready to create our
appliance. But before we start the actual design and
layout, it’s always best to have a clear and detailed
method for creating, rolling, and managing the
various log files your Web server produces.

The first order of business is to make sure
that your server is logging the exact
information you need. By default, most Web servers
create log files that follow the Common format.
This format logs a single line of output per
request, which includes the client’s IP address;
authentication information where it’s
appropriate; date and time; the actual HTTP request
received; the HTTP response code; and the
actual size of the response in bytes.

You can receive a lot more detail regarding
the traffic and usage patterns for your site if you
opt for the Combined format. This format adds
two additional data fields to the Common
format: Referer and User-Agent. The Referer field
contains the URL of the link that referred the
user to this page. That information is useful for
checking out how users arrive at your site, and
how they move around while visiting. The
User-Agent field notes the user’s browser type.

The method of specifying this format varies
between Web servers. With Apache, you would
add the following lines to your httpd.conf file
and restart your server:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
CustomLog logs/access_log combined

The LogFormat directive creates a combined
format type, which adds the Referer and
User-agent HTTP request headers to the
logged data.
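
To show what an analyzer has to do with each Combined-format line, here is a minimal Python sketch that splits one line into named fields. The regular expression and field names are my own illustration, and it assumes no embedded double quotes inside the request, Referer, or User-Agent values.

```python
import re

# One named group per field of the Combined log format.
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of fields, or None if the line does not match."""
    match = COMBINED_RE.match(line)
    return match.groupdict() if match else None


# Made-up sample line for illustration.
sample = ('192.0.2.7 - - [10/Oct/2001:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98)"')
fields = parse_combined(sample)
```

The host field here is still a bare IP address, which is exactly what the appliance wants if the Web server is not doing DNS lookups itself.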

Our next hurdle is to determine how we
want to roll the logs. It’s a very bad idea for log
analyzers to work on active log files, so you
need a technique for closing and storing the
current log file while opening the new one, and
ensuring that no data is lost during the
transition. You might assume that some simple Unix
commands would be enough to make a copy of
the active log file and then truncate it:

% cp access_log stored_log
% cp /dev/null access_log

The problem, however, is that during the 
period of time between when the copying 
finishes and when the file is truncated, your 
Web server may be adding additional data 
to the log file—data you’ve just deleted! Moving 
the log file with the Unix mv command won’t 
work either, because the Web server is still 
attached to the file, no matter what it’s called. 

Apache comes with a script that solves this 
problem. Alternatively, you could use its reli¬ 
able piped logging capability, which instructs 
Apache to send the log data not to a file 
directly, but to a Unix process. 
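
As a rough illustration of the piped-logging idea, the following Python sketch reads log lines from any iterable (for a real pipe, that would be the process's standard input) and appends each one to a file named for the current rotation period. The names log_name and pump, and the one-day period, are assumptions made for this sketch; it is not the script that ships with Apache.

```python
import time

def log_name(prefix, when, period=86400):
    """Name the file for the rotation period containing `when` (epoch seconds)."""
    start = int(when) - int(when) % period
    return "%s.%d" % (prefix, start)

def pump(lines, prefix, clock=time.time, opener=open):
    """Append each line to the file for the current period.

    Files are switched only between lines, so an entry is never split
    across two files and nothing is lost at the changeover.
    """
    current_name, out = None, None
    for line in lines:
        name = log_name(prefix, clock())
        if name != current_name:
            if out is not None:
                out.close()          # the finished file is now safe to analyze
            out = opener(name, "a")
            current_name = name
        out.write(line)
        out.flush()
    if out is not None:
        out.close()

# As a pipe target one would call something like:
#     pump(sys.stdin, "/path/to/logs/access_log")
```

Because the process, not the Web server, owns the output files, the copy-then-truncate race described above never arises.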

No matter which method you use, next you
must make sure that the analyzer appliance
has access to the rolled log file. There are two
typical ways of handling this. The first is to
copy the file to the appliance. With this technique,
you use OpenSSH’s scp command to
securely copy the log file from the Web server
to the appliance.

The second technique is to copy the file to 
NFS storage. This method uses an NFS mounted 
storage device as the repository for all stored 
log files. The Web server copies the logged files 
to this location and the analyzer appliance 
reads the file from there. With this approach, 
you don’t need all that much local disk storage 
for the analyzer. 

I tend to use the latter approach. I like the 
infinite storage capability and the fact that it 
insulates the analyzer appliance from the stor¬ 
age headaches associated with archiving large 
log files. This way, I can supplement storage 
and not affect the appliance at all. 


Decisions, Decisions 

The next step is picking the right analyzer
package. In my opinion, there are two products
at the forefront: Analog, the popular open-source
analyzer, and NetTracker, by Sane Solutions.
Sure, there are others, but with these
packages I can find out at least 95 percent of
what I need to know about Web traffic. They
complement one another, but their features overlap
as well. The best solution is to use both;
but if you can’t or won’t, then the following
breakdown of each package can help guide
your decision:

Analog. Analog excels as a Web server traffic
analyzer. Its strength is in displaying the
patterns, bandwidth, and usage requirements
of each particular site, as reflected in the log
file report. The reports themselves are pretty
basic, but a few helper applications work with
Analog to create more visually appealing
reports. Analog is open source, which is a
significant factor for some people, myself
included. This means it’s available in source
code format. There are pre-built binaries available
for various systems as well, including
Windows NT and the Mac OS.

Two things really set Analog apart. First, it’s
fast. Very fast. This is due to the software
design (Analog’s code is lean and streamlined),
but the technology behind it is also responsible.
Analog takes full advantage of caching,
especially in reverse DNS lookups. It also lets
you use platform-specific reverse DNS tools,
which enhance performance even more.

Another important factor is that Analog can
be completely controlled in a command-line
environment. You can script exactly what
needs to be performed, and then create a cron
job to run the script at set times. For someone
like me, who cut his teeth on Unix
System 3, command-line control is very important.
It also means that Analog fits in nicely
with the whole appliance idea. It’s easy to
create scripts that run automatically, making
the device more truly plug and play.


It makes no sense to tune the hardware and operating system for peak
Web performance if you want the log analysis program to perform the
intensive analysis work as well.
As good as Analog is, it comes up short on
one aspect of log file analysis: user session
analysis. For things like how long a user spent
browsing your site, or the clickpaths he or
she followed, I find that only commercial
offerings provide the level of detail you’ll
want as a Webmaster, as well as the kind of
drill-down analysis that your clients
expect. The fact that you can generate attractive
reports directly from these packages is
icing on the cake.

NetTracker. NetTracker is a commercial log
analyzer I’ve recently added to my arsenal. I’m
still a big fan of WebTrends, but except for a
handful of expensive packages, WebTrends
software is available only on NT. NetTracker
works on a much wider range of platforms.
It’s just as easy to use, and provides the
detailed user-session information I need.

NetTracker’s advertising message is curious,
however. Ads boast that you can run the
product on the Web server itself to avoid
copying and moving the log files. They also
mention its Web-based control panel. These
don’t seem like selling points to me. In fact, I
almost didn’t even bother trying NetTracker at
all. But as it turns out, the product is perfectly
happy running on a separate dedicated server
(using copied log files), and has an extremely
robust command line interface as well.

End of Log 

By installing both Analog and NetTracker on
my dedicated analyzer appliance, I’ve added a
robust tool to my ever-increasing bag of
tricks. Now I have a central location for running
log analyses for all of my sites and servers
without adversely affecting my systems’
performance. A single individual can create
and organize log reports for several clients.
Clients only need access to one central log
appliance.

I especially like how this tool lets you easily
scale your Web infrastructure. Because you can
support numerous Web sites and servers with
a single log appliance, you also realize true cost
savings. Even if you don’t consider the license
fee for software, by offloading the analysis to a
dedicated appliance, you gain additional horsepower
for your Web server without undergoing
costly performance upgrades.

Jim has been active on the Net and the Web since
the late ’80s. He’s currently Senior Consultant for
Covalent Technologies. You can reach him at
jim(a)jaguNET.com.

As wireless Internet technology becomes cheaper, more and more
portable appliances will become Internet enabled. With an Internet-enabled
car radio, for example, listeners could order the CD or MP3 file
of a song they just heard by pressing a button. GPS mapping systems
would also benefit from Internet connections, which would allow them
access to updated traffic and weather information.

Though these types of consumer devices are exciting, industry will be
the greatest beneficiary of these technologies. For example, UPS and
FedEx truck drivers already use roving appliances to obtain electronic
signatures from delivery recipients. The appliances’ network connectivity
helps make package tracking accurate and timely. (See Amber
Howie’s “Recipes for Network Appliances” in this issue.)

It has been predicted that businesses will spend billions of dollars 
developing other roving Internet appliances in the coming years. Some 
of these appliances will have displays that can be used as Web browsers; 
others will not. Rather, the software that’s running in each appliance 
will make use of Internet services transparently to transfer information 
to and from conventional servers. 

Online Resources

These sites will get your business moving.

Cloudscape
www.cloudscape.com

Sleepycat Software
www.sleepycat.com

Sybase SQL Anywhere
www.sybase.com

FastObjects by Poet
www.fastobjects.com

Pointbase
www.pointbase.com

UDDI
www.uddi.org

AvantGo
www.avantgo.com




For effective delivery of such next-generation devices, several chal¬ 
lenges will need to be overcome. Roving Internet appliances are a special 
class of network appliances because they’re constantly moving around. 
They must continually adapt to the environments in which they find 
themselves whenever they change locations. 

In the past, companies like UPS were forced to invent their own tech¬ 
nologies to address mobility issues. Fortunately, vendors are beginning 
to introduce new solutions based on existing Internet technologies and 
methods that will give the next generation of devices a running start. 

Self-Service Applications 

Network connectivity is the first issue to address. For a roving network 
appliance, an Internet connection can be intermittent, unreliable, and 
costly. The primary challenge that developers face is minimizing feature 
degradation, even when the device doesn’t have access to the Internet. 

One way to achieve this goal is to arm mobile appliances with more
powerful transmitters so that they can remain in contact with central
servers across a broader area. It’s unrealistic to expect 100 percent
coverage, however. And as the power of broadcast equipment increases,
so, typically, does its bulk. There’s a point at which the appliance
becomes too cumbersome to be effective.

A better solution is to actually place scaled-down versions of server 
functionality inside the roving appliance. In essence, when the Internet 
connection is down, the roving Internet appliance can serve itself. 

One example would be an embedded SQL database inside the appli¬ 
ance. Usually, you think of databases as existing on servers and not on 
clients. However, in the case of a roving appliance with an intermit¬ 
tent Internet connection, a database inside the client could stand in 
for the server. 

As an example, let’s consider a possible application for the shipping 
industry. Every time the truck driver picks up or delivers a shipment, the 
shipping company’s main computer should be notified for tracking 
purposes. But if a truck’s roving Internet appliance can’t connect to the 
network, immediate notification isn’t possible. 

Repeated attempts to connect to the network for a single delivery 
would tie up the appliance’s resources. It would be better to have the 
appliance record the information in its own
internal database and schedule another notification
attempt at a later time. When the
Internet connection can be re-established or
the truck is at a depot, the appliance’s database
could then be automatically synchronized
with the main server’s database.

Pocket-Sized Databases 

Obviously, you wouldn’t want to place a 
complete copy of Oracle or DB2 on a truck’s 
roving Internet appliance. Instead, you’d use 
a database that’s designed for small-scale, thin Internet devices. 

One such database is Cloudscape (see the “Online Resources” box for
more information), which is made by Informix, now an IBM subsidiary.
You can download the developer’s version of Cloudscape from the Web
site, free of charge.

Cloudscape is written in Java, and is compliant with SQL92. Its footprint
is about 2MB, making it compact enough to embed in many small
Internet appliances. It’s easy to access Cloudscape from a Java program
using the standard Java JDBC library and SQL. With a little bit more
setup, you can access it from non-Java programs using an ODBC driver.

Other embeddable databases include Berkeley DB, an open-source
product from Sleepycat Software; Sybase SQL Anywhere; FastObjects by
Poet; and Pointbase (see the “Online Resources” box for more information
on these products). Still others are likely to appear as the market
for lightweight appliances grows.

During the day, our hypothetical roving Internet appliance for truckers
could directly access the main server’s database using SQL, so long as it
had an Internet connection. When an Internet connection wasn’t available,
the appliance could still use the same SQL code to store data on its local
embedded database. Later, when the truck has access to the main server
again, the contents of the local database and the main server’s database
could be synchronized, completing the delivery notification.
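
The store-locally, notify-later pattern can be sketched in a few lines of Python. This sketch uses the sqlite3 module purely as a stand-in for an embedded database such as Cloudscape; the column names, the synced flag, and the send callback (which represents whatever call notifies the main server) are all assumptions made for illustration.

```python
import sqlite3

def open_local_store(path=":memory:"):
    """Open the appliance's embedded database (sqlite3 is only a stand-in)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS ShipmentProgress ("
        " id INTEGER PRIMARY KEY, shipment_id TEXT NOT NULL,"
        " event TEXT NOT NULL, occurred_at TEXT NOT NULL,"
        " synced INTEGER NOT NULL DEFAULT 0)"
    )
    return db

def record_event(db, shipment_id, event, occurred_at):
    """Always write locally first, so the driver never waits on the network."""
    with db:
        db.execute(
            "INSERT INTO ShipmentProgress (shipment_id, event, occurred_at)"
            " VALUES (?, ?, ?)", (shipment_id, event, occurred_at))

def flush_pending(db, send):
    """Deliver unsynced rows in order; stop at the first failure and retry later."""
    rows = db.execute(
        "SELECT id, shipment_id, event, occurred_at FROM ShipmentProgress"
        " WHERE synced = 0 ORDER BY id").fetchall()
    delivered = 0
    for row_id, shipment_id, event, occurred_at in rows:
        try:
            send(shipment_id, event, occurred_at)
        except OSError:
            break  # connection is down; the remaining rows stay queued
        with db:
            db.execute("UPDATE ShipmentProgress SET synced = 1 WHERE id = ?",
                       (row_id,))
        delivered += 1
    return delivered
```

A scheduler on the appliance would call flush_pending periodically, or when the truck reaches a depot, instead of retrying the network for each individual delivery.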

Staying in Sync 

To illustrate, let’s suppose the shipping company’s main database 
contains the tables: Customer, Shipment, and ShipmentProgress, totaling 
tens of thousands of rows. The entity relationship diagram for these 
tables is shown in Figure 1. 

The customer table contains one row per customer. The shipment
table contains one row per shipment, and this is joined to the customer
table. The shipment progress table contains one row for each
significant, trackable event during the shipping process. For example,
it may contain a row for when the shipment is initially scanned, other
rows for each time the item is loaded or unloaded from a truck, and a
final one for when the shipment is delivered.

Only a small subset of the rows will be relevant to any particular
truck on any particular day. At the beginning of a trucking run, the
embedded database would download the section of the server’s primary
database containing only those rows that are likely to be relevant to the
truck’s run on that day.

At the end of the day, the server database would reload all of the data
from the embedded databases aboard each truck. Most of it would be
unchanged, as the preponderance of the rows (for example, company
information) will seldom be modified by the roving Internet appliance
software. All rows that were modified or inserted into the embedded
database would be updated in the server’s database (as in Figure 2).

Update conflicts are unlikely, because each truck is involved in
different shipments. However, in rare cases where conflicts do arise,
they’ll have to be resolved using either smart software or manual intervention.
For example, if two trucks delivered shipments to the same
company, and both changed the telephone number of the company to
a different, inconsistent telephone number, then the resolution would
probably require manual intervention.

In other cases, program logic could determine which data is more 
accurate. Cloudscape, for example, contains a special module, called 
Cloudsync, to help in this reintegration process. 
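
As a rough sketch of the end-of-day merge, the Python function below applies each truck's changes to the server's rows and sets aside any value that two trucks changed in different ways. The data layout (a dictionary keyed by table, row, and column) and the function name merge_changes are assumptions chosen for brevity; this is not how Cloudsync works internally.

```python
def merge_changes(server_rows, truck_changes):
    """Merge per-truck edits into the server's data.

    server_rows:   {key: value} as currently stored on the server.
    truck_changes: {truck: {key: (base, new)}}, where base is the value the
                   truck downloaded and new is what it holds at day's end.
    Returns (merged_rows, conflicts).
    """
    proposals = {}
    for changes in truck_changes.values():
        for key, (base, new) in changes.items():
            if new != base:                      # unchanged rows are ignored
                proposals.setdefault(key, set()).add(new)

    merged, conflicts = dict(server_rows), {}
    for key, values in proposals.items():
        if len(values) == 1:
            merged[key] = next(iter(values))     # one agreed-upon new value
        else:
            conflicts[key] = sorted(values)      # leave for manual resolution
    return merged, conflicts
```

With the telephone-number example above, both trucks' proposed numbers would land in the conflicts result while the server keeps its original value, and an ordinary status update on a shipment would be merged without fuss.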

Getting Connected With UDDI 

Another challenge for roving Internet appliances is that users will 
want to connect to different servers depending on their location. For 
example, truck drivers may need to electronically contact customers 
to ask for the location of a loading dock. 

To do this, the roving Internet appliance may have to “discover” the
URL of the company receiving the delivery, and determine the communications
protocol of that company’s computer, in real time. A new set
of proposed Internet standards called UDDI (see “Online Resources”)
allows this to happen. UDDI stands for Universal Description,
Discovery, and Integration, and is actively supported by Microsoft,
IBM, and many other companies. UDDI lets devices on the Internet
find each other based on the set of services they provide.

UDDI is built on top of SOAP (Simple Object Access Protocol), an
XML dialect that lets programs directly invoke functions inside each
other over the Internet. When computers use SOAP to talk to one
another, the data, objects, or function calls transferred between them
are all represented in an easily processed, XML-based format.


Figure 1. The tables of the server that would be partially
copied to the embedded client database: Company, Shipment, and Shipment Progress.


How UDDI Works 

To take advantage of UDDI, a given industry must generate standardized
descriptions of the interfaces to any online services that it
intends to support. These descriptions are written in the Web Services
Description Language (WSDL), which is conceptually similar to an IDL
file in CORBA and COM, or an interface file in Java. The WSDL files are
then published in a globally accessible UDDI server.
Figure 2. The roving Internet appliance can use either its own embedded database or the server’s database.


Each company within the industry that supports a particular 
WSDL interface must then publish that fact, together with a URL 
for users to access its particular implementation of that service, on 
a UDDI server. IBM and Microsoft host the global server, which is 
freely available to any company that wishes to publish WSDL interfaces. 
In other situations, a private server located behind an enterprise firewall 
may be preferable. 

In addition to a URL, companies can also publish geographical or 
other information with their entries. This supplementary identifying 
data can help users of a service distinguish one company from others 
providing the same service. 

In Figure 3, you can see the way it would work in our trucking
industry example. The industry might publish the description of an
interface called TruckDockInformation that contains one function,
called TellMeWhichDock. Companies that have loading docks would
implement this interface. Trucks could then query the UDDI server for
the URL of the TruckDockInformation service at the particular company
accepting a shipment. Once found, each truck’s roving Internet appliance
could access the service, returning a dock number that indicates
where the truck should unload.

One of UDDI’s benefits is that the company providing a service
doesn’t have to write its software in the same computer language as
the company using the service. For example, the software in the truck’s
roving Internet appliance could be written in Java, while the code in the
machines at the company receiving the shipment could be written in
Visual Basic.

Figure 3. The appliance queries the loading dock computer to find out
where the truck should unload. When getting close to a client’s loading
dock, the trucker’s roving Internet appliance queries the UDDI server for
the URL of the company’s TruckDockInformation interface, then sends
TellMeWhichDock to the receiving company’s loading dock computer,
which returns a dock number.

Another advantage is that the company receiving the service can 
change which computer is providing the service. As long as the 
company publishes the new URL in the UDDI server, the client will 
always be able to access the service’s server by asking the UDDI server 
for the new URL. 
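
The lookup-then-call sequence can be sketched in Python as follows. The Registry class is only an in-memory stand-in for a UDDI server, and call_service represents whatever SOAP client the appliance would use; neither reflects a real UDDI or SOAP library API, and the company name and URLs in the usage lines are made up.

```python
class Registry:
    """In-memory stand-in for a UDDI server: (company, interface) -> URL."""

    def __init__(self):
        self._entries = {}

    def publish(self, company, interface, url):
        # Republishing simply replaces the old URL.
        self._entries[(company, interface)] = url

    def find(self, company, interface):
        return self._entries.get((company, interface))


def which_dock(registry, company, call_service):
    """Resolve the service URL on every call, then ask for the dock number.

    Looking the URL up each time means a company can move the service to
    another computer and clients follow along without any reconfiguration.
    """
    url = registry.find(company, "TruckDockInformation")
    if url is None:
        raise LookupError("no TruckDockInformation service for " + company)
    return call_service(url, "TellMeWhichDock")


# Usage with a fake transport that just echoes what it was asked to do.
registry = Registry()
registry.publish("Acme Receiving", "TruckDockInformation",
                 "http://docks.example.com/soap")
fake_call = lambda url, function: (url, function)
print(which_dock(registry, "Acme Receiving", fake_call))
```

The client code never stores the service URL, which is the property the paragraph above relies on.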

The Road Ahead 

As the market for roving Internet appliances continues to mature, more 
tools will become available to make deployment of applications like 
our trucking example even easier. Currently, much of the infrastructure 
must still be developed and integrated in-house, but vendors are 
beginning to offer packages that will speed the time to deployment 
even further. 

For example, AvantGo (see “Online Resources”), which has long partnered
with larger companies to develop custom mobile applications, has
recently begun to offer packaged application solutions as well. These
products provide a cost-effective means of addressing the synchronization
issues mentioned earlier, while drawing from AvantGo’s years of
expertise in the mobile appliance market.

Even if your organization chooses to go it alone, recent advances 
in Web development have brought widespread deployment of roving 
Internet appliances much closer to reality. For example, we’ve seen how 
UDDI, while it was developed with Web services in mind, is equally 
applicable to mobile appliances. 

As these developments become more widely adopted, what were
once Herculean efforts, possible only for large companies like UPS,
will become feasible for smaller organizations as well. Given the
benefits, the question may not be whether you’ll need to investigate
roving appliances for your business, but how soon.

Michael is the founder of Eloquence, a training and consulting firm. He
writes courses and delivers training worldwide on topics such as the Web
and Java programming. You can reach him at michael(a)eloquence.com.
Intrinsyc CerfCube

Intrinsyc 

www.intrinsyc.com 

$379 


Web Server Building Block 

How small can a Web server be? We all seem
to get a kick out of Web servers installed in
matchbox-size cases, but wonder what they
can really do. Case in point: I took the soda-can-size
CerfCube to work and showed it to my
coworkers (all embedded-software developers). I
was disappointed by their initial reaction:
“Cute, but what can it do?”

While the diminutive CerfCube might draw
attention, it has much more to offer than just
its size. To fully appreciate the CerfCube, you
have to get creative. For example, you could
build a portable network intrusion detection
system around it, which is configurable via its
Web server. Another idea is to build a dial-on-demand
network firewall using an external
modem from CerfCube’s serial port. This uses
only 2.5 watts (5 watts if you add a Compact
Flash card), so it’s ideal for applications that
require appliances to be on continuously or to
run on battery power.

Inside the CerfCube 

The CerfCube is built into a three-inch cast 
aluminum cube. Inside are a 192 MHz 
StrongArm processor, 32MB of RAM, and 16MB 
of Flash memory. On the back you’ll find a 10T 
Ethernet, a serial port, and USB connectors. 
The cube comes pre-installed with Linux or 
Windows CE; I tested the Linux version. I 
hooked it up to my network, connected the 
provided serial cable to my PC, started a 
terminal program, and plugged in the power 
supply. The CerfCube booted in 20 seconds 
and presented a Linux login prompt in the 
terminal window. I then logged in and set the 
CerfCube's IP address. The software loaded on 
the box includes basic Unix command-line 
utilities (most of them are part of the Busy- 
box open-source package) and the Apache 
Web server. 

The CerfCube is based on a 2.2-inch by 2.7-inch
single board computer called the
CerfBoard. In addition to the I/O ports
mentioned above, the CerfBoard also has
connectors for an LCD panel, two more serial
ports, a CODEC for audio I/O, and 16 general
purpose I/O lines. To access the Compact Flash
socket, you simply turn the cube upside
down; there’s no bottom to it. To access the
other features, you must remove the board
from the cube and add more connectors.

The CerfBoard has a Compact Flash socket.
This isn’t as common as the PCMCIA interface,
but it’s smaller, and there’s still a reasonably
good selection of cards ranging from micro
hard drives to network cards available for it.

Intrinsyc markets the CerfCube as a reference
platform to help you develop your own
appliances. It isn’t intended to be an out-of-the-box
appliance like a Cobalt or Celestix
server. It comes pre-loaded to help you get
started more quickly, not to make it ready to
use. If you accept this, then the cube can be
both feature-rich and cost-effective.

For more advanced product development,
you can buy a sister product that’s also based
on the CerfBoard, called the CerfPod. This
includes a 5.7-inch touch-sensitive LCD display
and has all of the CerfBoard’s features brought
out to external connectors via a separate
breakout board. The CerfPod seems pricey at
$3795, but that price includes a developer
support contract. Likewise, you can get a
CerfBoard OEM developer’s kit with support
contract and breakout board for $2795. The
CerfBoard by itself is $329.

Software Development 

The StrongArm processor has been around for a
few years, and a good Linux port is available for
it. The StrongArm has a built-in memory
management unit (MMU), which makes it relatively
easy to implement shared libraries and
multitasking in comparison to its MMU-less
cousin, the Arm 7, or many other small, embeddable
processors. Shared libraries let you cram
a lot more code into the 16MB flash memory,
and make porting programs from standard
Linux systems much easier.

The CerfCube comes with a CD-ROM that 
includes a development environment and a 
halfway decent set of instructions on how to 
set it up. To work with it, you install the envi¬ 
ronment on a Linux system. The documenta¬ 
tion calls for Red Hat 6.2, but other recent 
distributions would work as well. You can 
then cross-compile programs on the develop¬ 
ment system to run on the CerfCube. I tried 
this out by building a “Hello, world" C pro¬ 
gram, and used the FTP on the CerfCube to 
load the binary from my development server. 
It ran just fine. 



Pros: Small, very low power requirements, good hardware feature set.

Cons: GDB and USB support are missing.

I said the instructions are halfway decent 
because the CerfCube bootloader didn’t work 
as described in the development manual. To 
permanently store a binary on the CerfCube 
RAM disk, you must build a new disk image on 
the development server, and then transfer it 
into the flash device. 

At boot time, the Intrinsyc bootloader copies
two memory areas from the flash device into
RAM; these are the kernel and the RAM disk.
Normally, the bootloader then jumps into the
kernel code and starts up Linux. When you hit
Enter during startup, it drops into the bootloader
command line interface instead. At that
point, you can copy the new RAM disk file from
the development system directly to the flash
device using a Trivial FTP (TFTP) command.

You have to run a Bootstrap Protocol 
(BOOTP) server and a TFTP server on your devel¬ 
opment system; the manual covers this. You’ll 
probably have to install the TFTP RPM package 
on your server from the Red Hat CD—it isn’t 
installed by default. The BOOTP protocol is a 
subset of DHCP, so you have to install the 
DHCPD package if you aren’t already running 
a DHCP server. 

The development environment has GNU
tools including C and C++ cross-compilers for
Arm, and source packages for the installed
software and other packages you might want.
There are sources for Perl and for server
daemons for DHCP, sendmail, telnet, and FTP.
Because the developer kit is included with the
CerfCube at no extra charge, it’s a bargain, but
two elements are missing. The first is the GNU
debugger, or GDB. The second is a driver for the
USB port. Writing one yourself would be a big
task, and Intrinsyc really should include this
since it put a jack on the box.

For my projects, a single board computer 
needs to be powerful enough that it’s easy to 
port software to it. It should also be small, effi¬ 
cient, and relatively affordable. The CerfCube 
seems to meet all of these needs. 

I hope that a good developer community
forms to support the CerfCube, as it has
around the similar Lineo uCsimm card. Though
I encountered some serious glitches in the RAM
disk procedure, I think these will be ironed out
in future releases. The combination of hardware
and software in this product still makes it
very appealing.

-Brian Wilson

Brian is cofounder of Harbro Systems in Santa
Rosa, CA. You can send him your comments at
bwilson(a)harbrosystems.com.



Combination Set 

U.S. Robotics 

www.usr.com 

$289.95 

Easy WAN 

If you’ve ever experienced the joy of surfing the
Web while sitting naked in your lawn chair, then
you’ll probably give up wireless networking only
when they pry it out of your cold, dead hands.
Wireless technology can profoundly affect your
lifestyle, in a way that quickly makes the technology
a necessity. However, to enjoy wireless
computing, you must know how to set up a wireless
network. While there are many wireless
products that target the consumer market,
hardly any are safe with a novice computer user.

U.S. Robotics attempts to address this 
usability issue with its Wireless Access Network 
(WAN) Combination Set, which comes with 
software designed to allow any user to turn his 
or her Windows desktop into an Internet shar¬ 
ing device, a.k.a. NAT. While U.S. Robotics gets 
a few things right, this still isn't the break¬ 
through that will bring wireless to the novice 
user. Whoever installs this WAN still needs to 
know the difference between a WAN and a LAN 
(and let’s not kid ourselves about the number 
of people who actually do). 

On the other hand, this solution will please 
any savvy home user or network administrator 
looking for a small office/home office (SOHO) 
wireless network solution. Of course, this 
person will also have little use for the bundled 
software, and can probably safely toss it aside, 
with the exception of the actual driver. The 
good news is that this also works with Linux, 
although it’s impossible to learn that by read¬ 
ing the documentation. 

The driver installation is pretty uneventful,
as it should be. Following the driver installation,
a screen appears with options for installing
configuration utilities for the wireless
card and access point. The wireless card
configuration tool includes options for WEP
encryption (including WEP key generation),
hard limits on bandwidth, and the wireless
network mode (access point or peer-to-peer).
First check the Configuration tab. Various
mode options include Infrastructure, Ad Hoc,
and 802.11 Ad Hoc. Infrastructure is for using an
access point and Ad Hoc means peer-to-peer. If
you want your wireless network to access any
outside networks, such as the Internet, then
choose Infrastructure.

The next option is SSID, which is supposed 
to let you choose between Basic Service Set 
(BSS) and Extended Service Set (ESS). BSS is for 
a wireless network with one access point, while 
ESS lets wireless clients roam a wireless 
network with multiple access points. 

The next configuration option is for rate- 
limiting bandwidth, in case network perform¬ 
ance is poor. Further options let you set the 
encryption mode (64-bit, 128-bit, or Disabled), 
power saving mode, or wireless channel, 
although the last option can usually be left as 
Default. The same utility has a Link Info tab that 
lets you monitor your network performance, 
including transmission rate, throughput, link 
quality, and signal strength. I enjoyed having a 
screen that told me whether the card was actu¬ 
ally working. And given a larger wireless network 
with multiple clients, this could prove quite 
handy. Finally, the configuration utility includes 
an Encryption tab from which a key can be 
generated with a user-supplied pass phrase. 

The second piece of software you can install
is the Wireless Local Area Network (WLAN) AP
Utility for the access point. This is a simple,
intuitive, and handy utility that lets an administrator
scan the wireless network and detect
which clients are aboard. It lists the machine
name, MAC address, and IP address.

The last step in the included documentation
outlines how to configure Windows as a router.
Novice users won’t be able to glean enough
information from the instructions to be off and
running in minutes, which seems odd because
the product clearly targets the consumer and
SOHO markets. More helpful documentation is
included in the PDFs on the CD-ROM, and I highly
recommend that networking newbies check
these before delving into the installation process.

Performance 

For the most part, performance was pretty
good. However, the documentation lists
Windows 95 and Windows 98 FE as partially
supported and Windows 98 SE as fully supported.
Of course, Windows 95 and Windows 98
FE don’t come with routing capabilities, and
the included documentation lists a Web
site where you can download a software hack
(WinRoute Lite) that installs LAN/WAN capabilities.
This provides extremely subpar performance,
is quite buggy, and frankly, isn’t
worth the time and effort it takes to use it. If
you’ve used this and managed to set up a
DHCP server that works, I can only guess at
how you accomplished it. There’s no routing
problem with Windows 98 SE, but I would still
recommend at least Windows NT 4 for an
access point using the included U.S. Robotics
software. If you’re using Linux as an access
point, any version will include routing and
Internet sharing capability.

Limited Drivers 

U.S. Robotics earns big demerits for not including
valuable information for those of us who
don’t use Windows. There was absolutely nothing
in the documentation to indicate that the
chip set in the PCMCIA card is based on the
same WaveLAN chip set from Lucent that’s
used in many other wireless cards. Moreover,
the U.S. Robotics Web site contains no information
or links regarding Linux drivers or any
other non-Windows operating system. Luckily
my Linux laptop and desktop already had the
WaveLAN module installed, so I didn't have to
dig for drivers. Getting the Peripheral Component
Interconnect (PCI)-to-PCMCIA adapter for
my desktop to work was another matter. While
the details of that exercise are best left for
another article, here’s a helpful URL:
oreilly.wirelessdevnet.com/pub/a/wireless/2001/03/06/recipe.html.

In summary, this is a decent entry point for
U.S. Robotics in this market, and if it solves its
documentation and cross-platform issues, the
WAN Combination Set will be an even more
attractive addition to any network administrator’s
toolbox.

—John Mark Walker 

John Mark works as foundry manager for
SourceForge.net, a division of VA Linux Systems.
You can reach him at jmwalker@valinux.com.
Photoshop: Restoration &
Retouching

By Katrin Eismann
Que, 2001, 276pp.

$49.99

When I was born, my parents fantasized that
I would grow up to become a great artist. My
inability to draw a straight line, however,
quickly ended that dream. All the same,
I was excited to receive my first
painting software 15 years ago. The box
showed wonderfully intricate pictures
that people had drawn using the software.
Unfortunately, my hopes for similar
results were quickly dashed when I discovered
that I could draw a straight line with the software,
but not much else.

My artistic clumsiness extends to cameras
as well. Despite the best efforts of camera
manufacturers to develop true point-and-click
devices, I still manage to muck up the occasional
snapshot. Theoretically, I could use the
computer beneath my desk to correct an
underexposure or remove an awkward blemish.
However, doing so would mean using
Photoshop or something similar, which brings
me back to where I started.

Hope sprang anew when I received Katrin
Eismann’s Photoshop Restoration & Retouching
in the mail, and this time my dreams were fulfilled.
Eismann’s book is a step-by-step, illustrated
primer that explains how to retouch
digital photographs using Adobe Photoshop.
This is a godsend for amateurs like me, who
simply want to make their pictures look a little
better.


Retouching for Dummies 

Eismann, an experienced graphics designer and
respected teacher, understands that Photoshop
can be overwhelming for novices and she structures
her book accordingly.

The first four chapters discuss tone, contrast,
lighting, and color, all of which can be manipulated
fairly easily with dramatic results. Eismann
doesn’t shy away from lingo, but her clear examples
make the material extremely accessible. You
won't come away with a deep love of gamma
correction, but you will learn what it does and
how to manipulate it.

After the first four chapters, the book starts to
get more challenging. Eismann goes beyond
manipulating sliders and buttons, and starts
describing how to retouch and repair photographs:
everything from removing unwanted
spots to repairing a torn portrait. The adventurous
reader will especially appreciate the final
chapter, which explains how to perform digital
liposuction and correct other undesirable imperfections.

Some of the techniques described in the
book are understandably advanced, and your
mileage will vary. It would have been helpful if
the book had contained more information on
how to scan a photograph for optimal retouching.
Nevertheless, the book is easy to follow
and encourages experimentation. It also fills a
very important niche. Photoshop Restoration &
Retouching hasn’t restored my artistic aspirations,
but it has certainly improved my photo
collection.




Information Anxiety 2 

By Richard Saul Wurman 
Que, 2001, 308pp.

$29.99 


Curing or Causing Information 
Anxiety? 

Richard Saul Wurman published Information
Anxiety in 1989, five years before the Internet
explosion. In it, he described how we were
being inundated with mostly irrelevant
information, and he offered solutions on how to
make better sense of this increasingly
overwhelming world.

The follow-up, Information Anxiety 2, is
disappointing at best. The book shows promise
at first, as Wurman describes the problem
of information overload, and explains how the
Internet has worsened it. However, despite the
book’s title, Wurman isn’t a Luddite. He
admits that the Internet has had positive
effects on society, and has the potential for
achieving even more.


Fulfilling this potential requires an
understanding of what makes people tick,
and Wurman attacks this question in several
ways. His own expertise seems to be in information
design,

Too Much Information, Not 
Enough Depth 

For the most part, Wurman’s examples are
excellent and his writing is clear and easily
digestible. However, the book is mostly
limited to these soundbites and examples,
while offering few insights of any depth. This
is the very thing that Wurman is preaching
against! Time after time, I found myself
nodding in agreement and anticipation as
Wurman outlined the symptoms of information
overload (poorly integrated information,
overemphasis on more rather than better, and
so on), only to be left hanging.

Part of the book’s problem is that its scope
is too broad. Wurman seems unable to decide
whether he wants to write a book on information
design or organizational management. As
a result, the coverage of both is superficial.
This is a real shame, especially when you
consider some of the great work with which
Wurman is associated, such as his redesign of
the California Yellow Pages and his Access
series of guidebooks.

Too often, this book reads like a laundry
list of Wurman’s projects, relevant or not, as
well as an advertisement for Wurman’s
Technology Entertainment Design (TED) conference.
At one point, the author gives a
detailed description of his idea for a cookbook.
He then complains that Martha
Stewart was interested, but unwilling to work 
with him due to disagreements over whose 
name would go on the cover—too much 
information. 

To be fair, several of the tenets that
Wurman describes have become part of the
collective intellect over the past few years,
and the first edition of his book may have
played a role in this phenomenon. As a whole,
however, Information Anxiety 2 is too long, too
broad, and too self-indulgent.

Eugene writes, programs, and consults on a freelance
basis. He is currently writing a book on the
history of free software, entitled Software,
Money, and Liberty: How Source Code Became
Free. You can reach him at eekim@eekim.com.


www.webtechniques.com October 2001

BRAINSTORM GROUP’S 


eBusiness Integration

CONFERENCE SERIES 


The eBusiness Integration Conference Series is the 
leading forum specifically designed to provide business 
and IT leaders with solutions to the full spectrum of 
e-business integration challenges. 

Featuring leading analysts, authors and end user
case studies, this series details business driven
strategies, the latest technological advancements,
proven "Best of Breed" solutions
and trends in e-business integration.


The Changing Face of
E-Business Integration

B2B strategies have shifted dramatically, 
the importance of operational systems 
and data is greater than first perceived 
and organizational transformation has 
emerged as a critical success factor. 
Companies are learning why incorporating 
integration into the design stage of an 
e-business project is so essential and have 
declared business process integration as 
one of the most critical issues of 2001. 


Last year’s approaches no longer apply. 

Companies must stay current as the e-business 
integration agenda is reshaped on a continuing basis. 

Applying yesterday’s strategies can topple well-intended 
e-business integration plans. Staying current means engaging 
in an ongoing dialog with industry peers and experts - the 
individuals on the cutting edge of e-business integration. 


B2B Integration 
Strategies & Solutions 


Plan to attend the eBusiness Integration Conference 
in order to integrate business partners, processes, the enterprise, 
suppliers, applications and mobile devices. 


Enter to WIN a 3-Day Conference Pass (a $1495 Value!) 
Visit www.brainstorm-group.com for details 


New York 

September 19-21, 2001 


San Francisco 

October 29-31, 2001





www.brainstorm-group.com T: 508-393-3266 F: 508-393-8845 
Your Talents
to the Next Level 



DV Expo provides practical solutions 
in digital video, Web video, and 3D. 

Learn how to use the evolving tools and 
technologies essential to dynamic media 
production, postproduction, and delivery. Acquire 
new tips, tricks, and groundbreaking techniques 
to power your digital video projects. 


Conference: December 3-7, 2001 
Expo: December 4-6, 2001 
Los Angeles Convention Center 
Los Angeles, CA 

Go to DVexpo.com for complete conference details, 
including early bird discounts, FREE expo passes, 
and a chance to win a $50,000 professional video studio. 

DVexpo.com 

(Contact by phone: 415-947-6135) 

Register TODAY and prepare to master new skills!



Platinum Sponsors: Event Sponsors: 
A world of countless digital appliances can be tough to manage, but some
folks at HP Labs are betting that Cooltown will help us all get along.




ACCESS 



Appliances Just Got Cooler 


A Discussion with 
Hewlett-Packard's 
Jeff Morgan 


Imagine a world in which you could control a range of
smart appliances by using a single, simple control device,
such as a mobile phone or a PDA, with interfaces based on
existing Web standards. That just scratches the surface of
Cooltown (www.cooltown.com), Hewlett-Packard’s online
demo of how wireless technologies can be applied to everyday
life. Web Techniques asked Jeff Morgan, Cooltown’s technical
director, to give us a tour of the future.


WT: So what does Cooltown’s approach have that PC
applications don't?

JM: The PC has a lot of utility, but one big problem: you have
to be near it. The notion of mobility breaks the PC’s application
model. A mobile person is likely to encounter many
different services and devices that he or she may need to
interact with or manipulate. It’s impossible for all of that
software to be preloaded onto a device that’s carried around.
Instead, the Cooltown model supports two key generic
use-models: Web browsing and beaming. Users with very simple
devices can interact with Cooltown environments, and
they’re able to interact with a myriad of other devices, each
of which might offer a different user interface.

WT: So my toaster would have a Web server inside it?

JM: Well, the only reason to put a Web server in something
like a toaster is so that it has some kind of soft control
interface. In general, you need to look at the operation and use
mode of the device before you decide to put a Web server in
it. But the roots of Cooltown do stem back to research we
were doing in 1995, when we were looking for an infrastructure
to support, control, and communicate with and
between home appliances. We had two options: Define our
own communication architecture from the ground up, or
re-use a standard connectivity model to do the same thing. We
happened on the notion of embedding Web servers into
small devices and using those servers to provide Web pages
people could use to control the devices.

WT: You’ve come a long way from there.

JM: The Web in those days was really about Web pages;
no real services were in place. We Web-enabled a number
of different home appliances, from TVs, to VCRs, to home
lighting controls, to blood sugar monitors. From there,
we developed a full appliance infrastructure that supported
not only human-to-appliance interaction via Web
pages, but also appliance-to-appliance interaction using
HTTP messages.

WT: How does that compare to some of the technologies
from the Web services model, like SOAP for instance?

JM: The appliance connectivity technology within the
Cooltown infrastructure was developed long before XML or a
formalized RPC model over HTTP like SOAP. This idea was
always part of the architecture though, so we were able to
take immediate advantage of the feature set provided by
SOAP. The same is true for UDDI and other Web-based service
models.

WT: OK, but why choose a Web-oriented model for controlling
home appliances?

JM: In general, distributed computing technologies have
traditionally had severe deployment issues because they
often required complex infrastructure. Cooltown, on the
other hand, is based on a pervasive connectivity model with
a thriving community. The Web paradigm provides a few
things: a basic connectivity model, HTTP request/response,
Web markup for describing user interfaces. XML provides the
basis for more complex data exchange between processes
and appliances. We feel that any Web-based technology is
very complementary to the infrastructure we’re developing.

WT: Realistically, how long before this idea actually
becomes feasible?

JM: I could build a system today. I could keep a list of my
favorite music on my PDA and beam it to an Internet radio
that might be at home, work, or a friend’s home. I could
use the same PDA to control the playback of the music. Or,
I could expand my collection by picking up new music that
I download from the music store, or have beamed to me by
a friend. The infrastructure is really in place for adoption
now. To make adoption widespread, though, there needs to
be a good set of applications or solutions built on the
technology.

WT: So how could a developer get started right now?

JM: For Web developers, the learning curve should be short;
they can take advantage of the Web development tools they
already use in conjunction with the Coolkit tool set, part of the
CoolBase release (www.cooltown.com/dev). The CoolBase
platform consists of several components. These include software
for enabling smart, connected Web devices; software
for representing people, places, and things, and their contextual
relationships; and some supporting hardware and software
elements. There are also sample applications to
illustrate the use of these various elements; for example, the
Internet radio we demonstrated at the O’Reilly Open Source
show. By putting CoolBase into the open-source community,
we hope to attract a community of developers that will help
define and refine the future applications and solutions for
Cooltown.
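
The idea Morgan describes, a small Web server embedded in an appliance that serves a control page to people and answers plain HTTP messages from other devices, can be illustrated with a minimal sketch. This is not CoolBase or Coolkit code; it uses only Python's standard library, and the toaster state, the URL paths (/, /set, /status), and the function names are all hypothetical choices made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical appliance state; a real device would drive hardware here.
STATE = {"name": "toaster", "power": "off", "darkness": 3}

# Human-readable control page, filled in from STATE on each request.
PAGE = """<html><body>
<h1>{name}</h1>
<p>Power: {power}, darkness: {darkness}</p>
<a href="/set?power=on">On</a> <a href="/set?power=off">Off</a>
</body></html>"""


def apply_settings(query):
    """Update STATE from a parsed query string, ignoring invalid values."""
    power = query.get("power", [""])[0]
    if power in ("on", "off"):
        STATE["power"] = power
    darkness = query.get("darkness", [""])[0]
    if darkness.isdigit():
        STATE["darkness"] = int(darkness)


class ApplianceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path == "/set":
            apply_settings(parse_qs(url.query))
        if url.path == "/status":
            # Machine-readable view for appliance-to-appliance messages.
            body, ctype = json.dumps(STATE), "application/json"
        elif url.path in ("/", "/set"):
            # Control page for any Web browser (PDA, phone, PC).
            body, ctype = PAGE.format(**STATE), "text/html"
        else:
            self.send_error(404)
            return
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS choose a free port."""
    return HTTPServer(("127.0.0.1", port), ApplianceHandler)
```

To try it, call make_server(8080).serve_forever() and open http://127.0.0.1:8080/ in a browser. Changing state through a GET link keeps the sketch short and matches the simple link-driven pages of the period; a production design would more likely use POST for state changes.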
We have a 98% Client Retention Rate. Why? We average support hold times of less
than a minute and real people are here for you 24/7/365. We offer you the latest
technology at lower prices than our competitors charge for yesterday's leftovers.
We provide generous bandwidth allowances. We give you a 99.99% network uptime
guarantee. We provide you a great network and great support at a great price.



A Rock Solid, Fully Redundant Network built on a Dual OC192 Backbone & a Gigabit Fiber Optic Infrastructure, featuring
Juniper Routers & Cisco & Extreme Switches. Utilizing direct access to each of the major backbone providers, powered by a unique
network infrastructure and proprietary routing technology, we deliver you mission critical speed and reliability. And we back our network
with our friendly, knowledgeable support staff, 24 hours a day, 7 days a week. State of the Art DataCenter - Our world-class
facilities feature raised floors, a temperature & humidity controlled environment with HVAC temperature control systems, with separate
cooling zones & seismically braced racks. They offer a wide range of physical security features, including state-of-the-art smoke detection
and fire suppression systems, motion sensors, 24x7 secured access, as well as video camera surveillance & security breach
alarms. Within these facilities, DataPipe delivers high levels of reliability through a number of redundant subsystems, such as multiple
fiber trunks from multiple sources, fully redundant power on the premises, and multiple backup generators.


FEATURED DEDICATED SERVER SOLUTIONS: 

Your choice of Windows 2000 , FreeBSD or RedHat Linux 
Servers Can Be Configured to Your Specifications, Call for a Quick Quote! 

FAST START - $295 a month, 30GB IDE Hard Drive, 256 MB RAM, Intel P3
1GHz Processor, 50 GB Data Transfer, 10 IPs

BUSINESS - $395 a month, Dual 9GB SCSI Hard Drives, 512MB RAM,
Intel P3 1GHz Processor, 75 GB Data Transfer, 20 IPs

CORPORATE - $520 a month, Dual 18GB SCSI Hard Drives, 1GB RAM,
Intel P3 1GHz Processor, 100 GB Data Transfer, 30 IPs

ENTERPRISE - $995 a month, Compaq DL380, Four 18GB SCSI Drives,
1GB RAM, Dual Intel P3 1GHz Processors, 200 GB Data Transfer, 50 IPs

FEATURED SUN SERVERS:

Netra T1 - $495 a month, 440 MHz UltraSPARC Processor, 18GB SCSI
Hard Drive, 512 MB of RAM, 200 GB of Data Transfer, 20 IPs


Featured Virtual Hosting Solutions 
SyreShopping - $49.95 a month 

* Powerful E-Commerce Software, SQL Server Backend

* 300 MB of Disk Storage, 30 e-mail accounts, 20GB of Data Transfer

ADVANCED WINDOWS 2000 HOSTING PLAN - $99.95 a month

* 500 MB of Disk Storage, 50 GB Data Transfer, 50 E-Mail Accounts

* Servers are limited to 20 websites for maximum performance!

* FP2002, Index Server, AspUpload, ASPMail, SAFileup, AspQMail included

VIRTUAL HOSTING RESELLER PLAN

* Just $69.95, $29.95 Set Up Fee, 250 MB of storage, 20GB data transfer

* Each of your clients can have their own domain & IP number

* Your Setup fee includes your first domain, additional domains have a $14.95 set up fee each

COLD FUSION PLAN - $99.95 a month

* 200 MB Disk Storage, 30 GB of Data Transfer, 40 E-Mail Accounts

* Includes 50 MB of SQL Storage, Lightning Fast Raid 10 Servers

DataPipe PRIVATE SERVER - $149.95 a month - Ideal for Resellers!

* One GB of Storage, no limits on websites or e-mail accounts 

* Brand Your Own Services, even provide your own DNS Services 

* Easy to Use Control Panel for you and your users! 


Managed Solutions Available * Clustering, VPN & Load Balancing Available * 24x7x365 Proactive Monitoring & Support



www.datapipe.com Call Toll Free 877-773-3306 





















t afford to 


Interland's Web hosting solutions allow your business to run better, faster, smarter 
and more reliably. And thanks to our recent merger, we're now the leading provider 
of business-class hosting for small to medium size businesses. By hosting with 
Interland, you can rely on us to provide you with all of the online tools you need 
to successfully run your business. Everything from Web hosting to e-commerce to 
marketing support. So get all of your parts in place. By getting Interland. 



For more information call 1-866-279-0491 or visit us @ www.interland.com.
BUSINESS-CLASS HOSTING
MANAGED SERVICES APPLICATION HOSTING


E-COMMERCE PROFESSIONAL SERVICES 










